Re: How do I ask instration problem of Mail-SpamAssassin ?

2005-08-15 Thread jdow

Google for it. You may have to look for the very earliest version you
can find. And you will not be able to find any help for running it.
We've all forgotten those bad old days.

You got better advice from someone else directing you to a special
interest mailing list for your machine regarding upgrading perl. I am
willing to bet that 2.6x will not run on anything earlier than 5.5 perl.
So you'll have to go back farther and farther into prehistorical times.
Good luck if you persist on this path. I can't help you. (Nor would my
conscience allow me to if I still retained the knowledge.)

{^_^}
- Original Message - 
From: Atami Org. [EMAIL PROTECTED]




Dear Loren and jdow;
Thank you very much for your reply.
At what URL can I get 2.64?
Eiji Hamano


 Please advise me which old version of SpamAssassin can be installed
 with Perl 5.005. Where is the URL?

Hum, I wonder if 2.64 would run on 5.005?  I don't recall anymore.

I would not consider going back farther than 2.63/2.64.

If those won't run on 5.005, then I would consider installing Perl in a
separate directory path from the system Perl, and point SA at the local
install.  I think you can do this with CPAN install stuff.  However, I have
never done it myself, so can't tell you how.  Others likely can.

Loren
 





Re: Spam-Status tag with score numbers?

2005-08-15 Thread Simon Oosthoek

Matt Kettler wrote:

Simon Oosthoek wrote:

Matt Kettler wrote:
Ok, I've put this in the /etc/spamassassin/local.cf file and it doesn't
change the appearance of the Status header at all :-(

I'm not using spamd, but amavisd-new which calls spamassassin directly I
think using the perl-libs...


Hmm.. a few points here:

1) adding that to local.cf you'll need to do a clear_headers command first to
clear out the default header profile. Sorry for the oversight.

The full commands to add to local.cf would be:

clear_headers
add_header spam Flag _YESNOCAPS_
add_header all Status _YESNO_, score=_SCORE_ required=_REQD_ tests=_TESTSSCORES_ autolearn=_AUTOLEARN_ version=_VERSION_
add_header all Level _STARS(*)_
add_header all Checker-Version SpamAssassin _VERSION_ (_SUBVERSION_) on _HOSTNAME_


I tried this, but nothing changed, certainly not the tests with scores 
as I'd like to have. I'm starting to think your suggestion #3 is maybe 
the culprit, I'll see if I can ask the question on the amavis list...




2) you'll probably need to restart amavis to pick up the new local.cf. Amavis,
mailscanner, and other perl API callers wind up caching one or more
Mail::SpamAssassin instances. This has a similar effect to using spamd, but it's
all in the same memory space instead of separated by a socket.

3) depending on how you have amavis set, it might generate its own markups. In
particular the fast spamassassin option comes to mind, but I'm not an amavis
user..


Cheers

Simon

--
phone:(+31|0)53 4810319
fax:  (+31|0)53 4810333
[EMAIL PROTECTED]
http://www.ti-wmc.nl/


Very long scan times - Finding the culprit rule

2005-08-15 Thread Paul J. Smith
 Hi,

We are currently seeing scan times of 60-90 seconds on a P4 3GHz box
after adding some new Rules Emporium rules to try to increase the
effectiveness of SpamAssassin.

Is there a way to list the timing for each test rather than the total
scan time so I can see which parts are taking significant time and drop
them?

Thanks.


Re: How do I ask instration problem of Mail-SpamAssassin ?

2005-08-15 Thread Atami Org.
Dear jdow;
I do not fully understand your English, because I have little knowledge
of both English and SpamAssassin.
However, NOW I understand that no one has old versions of
SpamAssassin publicly available, and that I have to search for it myself.
Thank you again, jdow; I will try to do it!
Eiji Hamano


 



Re: Very long scan times - Finding the culprit rule

2005-08-15 Thread jdow

The first candidates, right off the bat, are DNS-based rules if you are
seeing long delays. You probably have half a dozen or more DNS-based rules
set up and DNS is not working.

{^_^}



RE: Very long scan times - Finding the culprit rule

2005-08-15 Thread Paul J. Smith

Hi,

DNS is working fine.  We've been running SA for 6 months no problem,
it's only when we added the extra 10 rule sets it got bogged down.  I've
just been removing them one by one at the moment and have got the timing
back down to 6 secs or so, but it would be very handy to have the actual
times of each test logged so I can see which are the slow ones.
 





Rules Emporium - what's been incorporated in 3.1.0?

2005-08-15 Thread Brian Morrison
As per the subject: I use a fair number of Rules Emporium rules. Is there any
information about which of those rules have made it into the 3.1.0 rule set?

Thanks

-- 

Brian




Re: Very long scan times - Finding the culprit rule

2005-08-15 Thread jdow

As I replied directly: it can't be done.

The time rule 1022 takes depends on all the other 2752 rules you are
running. Changing any one of them changes memory requirements. And if
it is not DNS then you are running into swap memory and will experience
HEAVY slowdowns. Reduce the number of concurrent spamd processes to
reduce the memory footprint. Trying to isolate the time for a single
rule can drive you to the funny farm quickly. This is especially true
when you are short of the memory required to keep SpamAssassin completely
in active memory rather than having it swap pieces in and out.

{^_^}




Re: Very long scan times - Finding the culprit rule

2005-08-15 Thread Loren Wilton
You can run DProf manually on SA and see what it says about rule timings.
Or at least you are supposed to be able to; the last time I tried it I
couldn't get it to work.

However, there may be a simpler answer.  You didn't mention the amount of
ram you have nor the number of children you are running.

My bet is that the extra rules have increased the size of your spamd
children, and you have enough of them that you are now thrashing.

If you added every SARE rule file (or at least every one you are supposed to
add for 3.x) then you have probably doubled the size of your spamd children.
This means that they have gone from 30-40MB each to probably 40-60MB each,
or possibly a tad larger.  If you had 5 children and 512MB, you probably
broke the bank.
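
The arithmetic behind that estimate can be sketched in a few lines of Python. This is only an illustration: the child count and per-child sizes are the figures quoted above, while `reserved_mb` (memory left for the OS, the MTA and everything else on the box) is an assumed number that does not come from this thread.

```python
def spamd_footprint_mb(children, per_child_mb):
    """Total memory used by all spamd children, in MB."""
    return children * per_child_mb


def fits_in_ram(children, per_child_mb, total_ram_mb, reserved_mb):
    """True if the spamd children plus everything else fit without swapping.

    reserved_mb is a hypothetical allowance for the OS, MTA, etc.
    """
    return spamd_footprint_mb(children, per_child_mb) + reserved_mb <= total_ram_mb


if __name__ == "__main__":
    # 5 children on a 512MB box, assuming 256MB is needed by everything else.
    for size in (30, 40, 60):
        total = spamd_footprint_mb(5, size)
        ok = fits_in_ram(5, size, total_ram_mb=512, reserved_mb=256)
        print(f"5 children x {size}MB = {total}MB, fits: {ok}")
```

With a different reserve the break-even point moves, so measure the real child sizes with top before trusting any such estimate.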

Answer would be to FIRST make sure that you only have the rules files that
you should have for whichever version of SA you are running.  (Far too many
people just grab everything, seemingly without noticing that some files are
only for certain versions of SA.)

Next, check the child sizes and available memory.  Consider cutting back on
some rules files, or the number of children; or adding a stick of memory.
:-)

Loren



Re: How do I ask instration problem of Mail-SpamAssassin ?

2005-08-15 Thread Loren Wilton
 However NOW I understand that noone have old versions of
 SpamAssassin as pablic.

There should be a version of 2.64 available publicly on the net someplace.
I would expect on the SA site someplace.  It was the last version before
3.0.

However, I am not sure that 2.64 will work with 5.005.  You will have to
check that.

I think your best course would be to go to the mailing list someone pointed
you to that discusses upgrading RaQ servers, and talk to them about
upgrading to a newer Perl.

Loren



Re: Very long scan times - Finding the culprit rule

2005-08-15 Thread Loren Wilton
 back down to 6 secs or so, but it would be very handy to have the actual
 times of each test logged so I can see which are the slow ones.

Check top.  This sounds a lot like you are thrashing.  The rulesemporium
rules are fairly carefully written to not be processor hogs, although we
have made mistakes in the past.  Those have typically only shown up when
someone had some very strangely formatted mail messages.  Now you might
indeed have hit some slow or badly composed rules; but I'm betting you are
just out of memory.

There is a Wiki page by Daniel or Justin or some such on profiling rules.
Should be fairly easy to find, and I seem to recall the steps were fairly
simple, except they didn't work for me.  :-(

Loren



Re: Rules Emporium - what's been incorporated in 3.1.0?

2005-08-15 Thread Loren Wilton
Hum.  Interesting question.  Thinking back, I don't believe that any entire
ruleset, or even any major hunk of a ruleset, moved into 3.1 that wasn't
already in 3.0.  There has been some rule migration, but it has largely been
piecemeal.

We will probably have to run an overlap check with the 3.1 rules and do
something like make a 3.1scores.cf or some such to zero out any rules that
overlap too much.

Loren



How to use Multilog ?

2005-08-15 Thread Dhanny Kosasih
I use SpamAssassin 3.0.4 with FC3, and I use the script from Fedora to start
and stop spamd. But qmailmrtg7 can't read the standard SpamAssassin log;
it can only read logs in multilog format. I tried --syslog=stderr, but I
don't know where the log file is. How can SpamAssassin use multilog format?


Regards,
Dhanny Kosasih.




RE: Question on NO_DNS_FOR_FROM Rule

2005-08-15 Thread Ronald I. Nutter
Thanks.  Will have to see how to do this with postfix.

Ron


Ron Nutter  [EMAIL PROTECTED] 
Network Infrastructure  Security Manager
Information Technology Services(502)863-7002
Georgetown College 
Georgetown, KY40324-1696

 

-Original Message-
From: Matt Kettler [mailto:[EMAIL PROTECTED] 
Sent: Friday, August 12, 2005 3:45 PM
To: Ronald I. Nutter
Cc: users@spamassassin.apache.org
Subject: Re: Question on NO_DNS_FOR_FROM Rule


Ronald I. Nutter wrote:
 I am getting quite a bit of spam coming in today that is scoring well
 below the 5.0 min (i.e. 2.4 or so out of 5.0).  The common thread I am
 seeing is that they all fail the NO_DNS_FOR_FROM Rule.  I noticed that
 it is only set to 1.1.  I am thinking about raising the value of this
 score.  I don't think that I should lose any email, should I?
 Guess I have a simplistic view of the world that no legitimate company
 would run a mail server without a valid A and MX record.

 Thoughts ?

Their DNS, or your internet connection, could be down at the time you
scan mail.

Thus, this test could occasionally hit for legitimate companies with a
proper MX record if their end has flaky DNS hosting.

Generally, I find it infinitely better to check this at SMTP time.

i.e. turn OFF accept_unresolvable_domains in sendmail.cf.

This way you temp-fail messages with unresolvable return paths, and they
eventually deliver when the source domain's DNS is resolvable again.

No legitimate mail should ever have an unresolvable envelope return path
(except the NULL return path, which isn't counted as unresolvable by
sendmail).

I know *I* certainly don't want such a message in my mail queues. Any
DSN that might get generated is certain to end up stuck in my mail
queue, then eventually double-bounce into postmaster's box. Ick.


filter for subjects

2005-08-15 Thread Fettke, Dirk
Hi,


I am getting desperate. I want any mail with a specific subject (like: viagra postbank, Adobe) to be marked as spam. The mail should be dropped and not delivered to the mailbox.

Our mail server is only for relaying and filtering spam and viruses, so there are no local mailboxes.

Is it possible to do this with SpamAssassin?

It can't be that difficult, can it?


Thanks

Dirk





Re: filter for subjects

2005-08-15 Thread Matt Kettler

At 08:19 AM 8/15/2005, Fettke, Dirk wrote:
I become desperate… I want any mail with specific subject (like: viagra 
postbank, Adobe…) mark as spam.


Ok, SA can be made to do that.. it's a little less straightforward than 
just saying block subject xyz but it's not hard.


A short rule with a high score will do the trick:

header BANNED_SUB1  Subject =~ /Banned subject text 1/i
score BANNED_SUB1   100



The Mail should be dropped and not delivered to Mailbox.


That is a trick SA itself can't do. It can't delete mail. HOWEVER, most of 
the tools that call SA can do this.


Our Mailserver is only for relaying and filtering for spam and viruses. So 
there are no local mailboxes.


Ok, so you want to do this at the MTA layer.. amavis or mailscanner would 
be good choices. Both can be told to delete spam mail over a certain score 
level.



Is there a possibility to do this with spamassassin?



When combined with other tools, yes, although something as simple as just 
blocking a specific subject is a job that spamassassin is overkill for.


If you only need subject blocking, and you don't need robust spam scanning, 
you might want to look at something less powerful like milter-regex instead.







Re: How to use Multilog ?

2005-08-15 Thread Matt Kettler

At 07:45 AM 8/15/2005, Dhanny Kosasih wrote:
I use SpamAssassin 3.0.4 with FC3, and i use script from Fedora to start 
and stop spamd. But qmailmrtg7 can't read log with standard SpamAssassin. 
The qmailmrtg can only read log with multilog format.





I try --syslog=stderr but i don't know where is the log file ?


Well, if spamd is daemonized, stderr goes to /dev/null unless you redirect 
it somewhere.


Perhaps you want something like:

spamd -s stdout | multilog {insert multilog options here}

Or if you want it to just log directly to a file:

spamd -s /var/log/spamassassin.log

And you can crunch that through multilog later..





How SpamAssassin use multilog format ?




Re: Very long scan times - Finding the culprit rule

2005-08-15 Thread Matt Kettler

At 04:12 AM 8/15/2005, Paul J. Smith wrote:

We are currently seeing scan times of 60-90 seconds on a P4 3Ghz box
after adding some new rules emporium rules to try to increase the
effectiveness of spamassassin.

Is there a way to list the timing for each test rather that the total
scan time so I can see which parts are taking significant time and drop
them?



No, but you can narrow it down a bit.

If you run a message through spamassassin -D you can see it tell you when
it is running header rules, body rules, etc. You can use this information
to at least know what type of rule you're looking for.


Then you can start pulling SARE files out one at a time till the scan time 
drops.


I would also double-check your memory footprint. If your spamds are all
really large (60MB) then you should look at the SARE rulesets and see which
one is large on disk and remove it. In general be a little wary of a rules
file that's over 256k or so. (bigevil in particular)
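
The "large on disk" check above could be scripted roughly as follows. This is a sketch, not an official tool: the 256k threshold is the rule of thumb from this message, the helper names are made up for illustration, and the rules directory you pass in depends on your installation.

```python
from pathlib import Path

LIMIT_BYTES = 256 * 1024  # the "256k or so" rule of thumb from the message above


def oversized(sizes, limit=LIMIT_BYTES):
    """Return (name, size) pairs strictly above limit, largest first."""
    big = [(name, size) for name, size in sizes.items() if size > limit]
    return sorted(big, key=lambda item: item[1], reverse=True)


def rule_file_sizes(directory):
    """Map each *.cf file name in directory to its size in bytes."""
    return {p.name: p.stat().st_size for p in Path(directory).glob("*.cf")}
```

Calling `oversized(rule_file_sizes("/etc/mail/spamassassin"))` (the path is an assumption; use wherever your SARE files actually live) lists the candidates to pull out first.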





RE: Question on NO_DNS_FOR_FROM Rule

2005-08-15 Thread List Mail User
...

Thanks.  Will have to see how to do this with postfix.

Ron


Ron Nutter  [EMAIL PROTECTED] 
Network Infrastructure  Security Manager
Information Technology Services(502)863-7002
Georgetown College 
Georgetown, KY40324-1696

 
[snipped]

Try:

smtpd_sender_restrictions =
    reject_non_fqdn_sender,
    reject_unknown_sender_domain,

and

smtpd_helo_restrictions =
    reject_non_fqdn_hostname,
    reject_invalid_hostname,
    reject_unknown_hostname,

and

smtpd_client_restrictions =
    reject_unknown_hostname,
    reject_unknown_client,

Plus whatever you're already doing.  (Warning: Lots of MS boxes
will be misconfigured and get refused - I consider this a good thing.)
YMMV.  I find these few rules cause about half of all connections to be
refused (re. half of all spam goes away).  Also AFAIK, these all return
a 450 code, so transient errors (i.e. overloaded DNS servers) are not fatal.

If you want to relax anything, it might be the helo restrictions,
because there are *so* many misconfigured Exchange boxes out there.  And
the client clauses effectively enforce rDNS - so you might not want them
either (or qualify them with warn_if_reject which will log a warning, but
allow the transaction to continue).

The documentation covers all of this.

Paul Shupak
[EMAIL PROTECTED]

P.S.  These are just the relevant clauses - I have them interspersed with
*many* other restrictions, access lists, etc..



Re: How to use Multilog ?

2005-08-15 Thread Chr. v. Stuckrad
On Mon, Aug 15, 2005 at 09:09:20AM -0400, Matt Kettler wrote:
 Perhaps you want something like:
 
 spamd -s stdout | multilog {insert multilog options here}

This should be exactly what you want.
BUT in the manual I only see 'stderr' allowed
for '... -s stderr'.  If 'stdout' does not work
you might need to run

 /bin/sh -c 'exec spamassassin -s stderr ... ... ... 2>&1'
instead of
 'spamassassin -s stdout ... ... ...'
This way you'll get stderr redirected to stdout by the shell,
and multilog gets the output.

Multilog (normally started by Bernsteins Daemontools
via supervisor) analyses standard input!

See: http://cr.yp.to/daemontools/multilog.html

Stucki

-- 
Christoph von Stuckrad  * * |nickname |[EMAIL PROTECTED]  \
Freie Universitaet Berlin   |/_*|'stucki' |Tel(days):+49 30 838-75 459|
Mathematik  Informatik EDV |\ *|if online|Tel(else):+49 30 77 39 6600|
Arnimallee 2-6/14195 Berlin * * |on IRCnet|Fax(alle):+49 30 838-75454/


Re: filter for subjects

2005-08-15 Thread hamann . w
 
 Hi,
 
 I become desperate... I want any mail with specific subject (like:
 viagra postbank, Adobe...) mark as spam. The Mail should be dropped and
 not delivered to
 Mailbox.
 Our Mailserver is only for relaying and filtering for spam and viruses.
 So there are no local mailboxes.
 Is there a possibility to do this with spamassassin?
 It can't be so difficult, or?
 
 Thanks
 Dirk
 

Hi Dirk,

Sure, SpamAssassin will help you; it will even detect these nasty messages
when the sender decides to move the offending words into the mail message.

You did not say what kind of mail server software you are using.

Wolfgang Hamann



Re: How to use Multilog ?

2005-08-15 Thread hamann . w


Hi,

The simplest way to use multilog is the way its author designed it :)
Rather than the start/stop script you are familiar with, set up daemontools to
run spamd as a service and pass it the option to NOT daemonize.
This will take care of the logging, and will also restart spamd should it ever die.
It will, however, try VERY HARD to run it even if some problem prevents it from
running.

Wolfgang Hamann



AW: filter for subjects

2005-08-15 Thread Fettke, Dirk
Sorry, I forgot.

I'm using postfix, amavisd, spamassassin

In my local.cf I have inserted these lines as Matt Kettler told me.

header BANNED_SUB1  Subject =~ /viagra/i
score BANNED_SUB1   100

Unfortunately it doesn't work. When I send myself an email from gmx with the
subject viagra, it is still delivered to my mailbox.

Any other ideas?
Greetings
Dirk






Changed maillog entries in 3.1.0-rc1?

2005-08-15 Thread Ed Kasky
Testing 3.1.0-rc1 on a RH 7.2 machine with sendmail, I noticed this morning
that the log entries in maillog have changed from:


Aug  7 03:48:54 yoda2 spamd[11492]: identified spam (13.0/6.9) for spamd:1205 in 2.9 seconds, 3420 bytes.


to:

Aug 15 06:23:55 yoda2 spamassassin[10790]: spamd: identified spam (49.1/6.9) for spamd:1205 in 2.9 seconds, 2101 bytes.


Is this intentional?  I only ask as it has an effect on a statistics script
I run daily that is expecting the former and this morning reported all
zeros.
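
One way a statistics script could cope with both layouts is a single pattern that treats the program name and the extra "spamd: " prefix as optional variants. The sketch below is written against only the two sample lines quoted above; it is not Ed's script, the names are illustrative, and other spamd log lines may need further cases.

```python
import re

# Matches both the old "spamd[pid]: identified spam ..." form and the
# 3.1.0-rc1 "spamassassin[pid]: spamd: identified spam ..." form.
SPAMD_RESULT = re.compile(
    r"(?:spamd|spamassassin)\[(?P<pid>\d+)\]: (?:spamd: )?"
    r"identified spam \((?P<score>-?[\d.]+)/(?P<required>[\d.]+)\) "
    r"for (?P<user>\S+) in (?P<seconds>[\d.]+) seconds, (?P<bytes>\d+) bytes"
)


def parse_result(line):
    """Return the fields of an 'identified spam' line, or None if no match."""
    m = SPAMD_RESULT.search(line)
    if m is None:
        return None
    return {
        "pid": int(m["pid"]),
        "score": float(m["score"]),
        "required": float(m["required"]),
        "user": m["user"],
        "seconds": float(m["seconds"]),
        "bytes": int(m["bytes"]),
    }
```

Keeping the prefix optional means the same script can read logs written before and after the upgrade.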


Ed

. . . . . . . . . . . . . . . . . .
Randomly Generated Quote (429 of 997):
Cogito cogito ergo cogito sum --
I think that I think, therefore I think that I am.
-- Ambrose Bierce, The Devil's Dictionary



Re: Very long scan times - Finding the culprit rule

2005-08-15 Thread jdow

As soon as you touch swap space you're dead. It's not unusual to see times
for processes increase by 10 or even 100 times. (Although about 10 is most
common.)

{^_^}
- Original Message - 
From: Paul J. Smith [EMAIL PROTECTED]



Thanks all.

I did check 'top' and did increase the memory to 512mb.  It's the latest
ver of SA so I think it's running only 5 processes.  It rarely seems to
dip into the swap space, though it does report all the memory as being
used so I was assuming it was merely processor time.  The CPU was
certainly maxed out most of the time.  I've removed most of the
additional rules and it's back down to 5-6 seconds now, but I'll add
them back one by one with a test period in between to see if I can
pinpoint what was happening.

I really want this to scale way up from where we are, so I guess a
reasonable machine with just loads of ram is the answer?






Re: filter for subjects

2005-08-15 Thread jdow

1) You can use SARE rules to increase scores for words like viagra.
2) You cannot under any circumstance have SpamAssassin not pass mail
  on to the next delivery step. It is possible to have the next
  delivery step drop the mail into /dev/null.
3) It is not wise to get too frantic and drop things just because they
  say Adobe. Some might be legitimate. (That is not a usual spam word
  here, at least.)

{^_^}




Bonded Sender

2005-08-15 Thread Russ Uhte
We're moving away from our current antispam setup which uses the bonded 
sender list.  In doing some checking to see how I want to setup SA, I 
noticed that currently many messages that look like spam are being 
whitelisted by our current setup because of the bonded sender list.


What is the basic feeling of Bonded Sender in the SA world?  I'm 
thinking it's crap...


Thanks,
Russ
---
[This E-mail scanned for viruses by Declude Virus]



Re: AW: filter for subjects

2005-08-15 Thread Duncan Hill
On Monday 15 August 2005 14:44, Fettke, Dirk typed:
 Sorry, I forgot.

 I'm using postfix, amavisd, spamassassin

 In my local.cf I have insert these lines like Matt Kettler told me.

 header BANNED_SUB1  Subject =~ /viagra/i
 score BANNED_SUB1   100

 Unfortunately it doesn't work. When i write myself an email from gmx with
 the subject viagra it will be delivered to my mailbox.

As other posters have said - SA is merely a tagging/scoring filter.  It 
optionally tags mail based on criteria you set.  It is up to your MDA or MUA 
to decide what to do based on the tags.

For example, in Amavisd-new, you can define the kill level and tell amavisd that
mail scoring over the kill level should go to quarantine, and quarantine only.
Questions regarding that, however, are questions for the Amavisd-new
documentation or list.


Re: Bonded Sender

2005-08-15 Thread Martin Hepworth

Russ Uhte wrote:
We're moving away from our current antispam setup which uses the bonded 
sender list.  In doing some checking to see how I want to setup SA, I 
noticed that currently many messages that look like spam are being 
whitelisted by our current setup because of the bonded sender list.


What is the basic feeling of Bonded Sender in the SA world?  I'm 
thinking it's crap...


Thanks,
Russ



Depends on how good the bonding company is at responding to abuse. From
what I've seen, most aren't that good (but there's always the exception).


--
--
Martin Hepworth
Senior Systems Administrator
Solid State Logic Ltd
tel: +44 (0)1865 842300




Re: filter for subjects

2005-08-15 Thread Loren Wilton
 I'm using postfix, amavisd, spamassassin

 In my local.cf I have insert these lines like Matt Kettler told me.

 Unfortunately it doesn't work. When i write myself an email from gmx with
the
 subject viagra it will be delivered to my mailbox.

 Any other ideas?

You need to do the second half of what Matt told you to do.

Spamassassin only assigns scores to the mail.  It doesn't route mail.

Something else in your configuration, which would have to be postfix or
amavis, will have to look at the score that SA has assigned to the mail and
decide, based on the score, to drop the mail.

I don't know either of those programs so I can't help you on exactly how,
but I'm sure someone around here will know.

Also, it is possible that you might have to restart your mail system after
changing rules before they will take effect.  It would depend on how amavis
runs SA.

BTW, in the regex you tried, instead of just using /viagra/, you would be
better off using /\bviagra\b/.  This will make sure that the word you are
testing isn't part of a larger word.  Probably doesn't matter much with this
particular word, but there are a lot of 'bad words' that are part of good
words, and if you just check for the 'bad' word without boundaries, you end
up kicking out perfectly good mail.  This is especially important when you
are throwing mail away rather than just marking it as spam.
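
The effect of the word boundaries can be seen with a small Python sketch. SpamAssassin rules are Perl regexes, but `\b` behaves the same way here; the "cialis" inside "specialist" pair is an illustrative example chosen for this sketch, not one taken from the thread.

```python
import re

# Same word, with and without word boundaries, both case-insensitive.
unbounded = re.compile(r"cialis", re.IGNORECASE)
bounded = re.compile(r"\bcialis\b", re.IGNORECASE)


def hits(pattern, subject):
    """True if the pattern matches anywhere in the subject line."""
    return pattern.search(subject) is not None


if __name__ == "__main__":
    for subject in ("Ask a specialist today", "Buy Cialis now"):
        print(subject, "| unbounded:", hits(unbounded, subject),
              "| bounded:", hits(bounded, subject))
```

The unbounded pattern fires on the harmless subject as well, which is exactly the kind of false positive that matters once mail is being dropped rather than tagged.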

Also BTW, SA has lots of rules to catch the sort of word you were looking
for, including most all of the obfuscation variants that spammers use.  If
you aren't familiar with writing rules for SA, you might be better off using
SA's standard rules (possibly with modified scores) and picking up some more
from RulesEmporium.

Loren



Re: Very long scan times - Finding the culprit rule

2005-08-15 Thread Chr. v. Stuckrad
On Mon, Aug 15, 2005 at 06:51:48AM -0700, jdow wrote:
 As soon as you touch swap space you're dead. It's not unusual to see times
 for processes increase by 10 or even 100 times. (Although about 10 is most
 common.)

Happened to us already twice.  It seems to hit 'just by chance'.

I assume it to be a 'bunch of too many large mails' hitting
'complicated rules' (especially rules with 'variably long'
patterns like '.{1,30}'), and so bloating up *all* children of
spamd in parallel.  Normally only one or two are bloated
and they 'die soon' being replaced by normally sized ones,
but extremely seldom *all* bloat, and the server goes down.

Stucki


Re: Bonded Sender

2005-08-15 Thread Loren Wilton
My very minimal experience with Bonded Sender is that the people who
contract directly are mostly fairly legit.  The people who contract through
the clever guilt-sharing arrangement at constant contact are spammers.

Be aware though that MANY spammers forge bonded sender tags.  If you have
one of the older methods of checking bonded sender, it is very probable that
a lot of your failures are forgeries that the newer bonded sender methods
should correctly detect.

I would not go so far as to say bonded sender is crap.  I would however say
that it is of fairly minimal usefulness in detecting whether a message is
spam.  The SURBL list, for instance, is far, far, better.

Loren



Re: Very long scan times - Finding the culprit rule

2005-08-15 Thread Loren Wilton
Have you changed --max-conn-per-child?  Usually a sudden bloat in a single
child is due to:

a) Running a Bayes expire in that child
b) Running an AWL expire
c) Processing a message that is very large

You can stop the first two from being problems by running a manual expire
from a cron job every so often and disabling the auto-expire runs.  You
should have a limit of 250K or so on the mail size to try to keep the third
from being a problem.

Usually (at least in my experience) the way a rule is written doesn't affect
the spamd memory size.  That is mostly determined by the number of rules to
get a reasonably static size, and then by things like expire runs or very
large messages to cause abnormal bloat.

If a lot of children get fat, you have a problem, of course.  Making sure
that they don't process a large number of mails before dying and restarting
is one way around this problem.

Loren



Re: ANNOUNCE: SpamAssassin 3.1.0-rc1 release candidate available!

2005-08-15 Thread Kenneth Porter
--On Saturday, August 13, 2005 6:58 PM -0400 Theo Van Dinter 
[EMAIL PROTECTED] wrote:



On Sat, Aug 13, 2005 at 03:07:14PM +0530, Ramprasad A Padmanabhan wrote:

When I build the rpm from the spec file ( on fedora core 3 ) the
spamassassin-tools rpm is not created. Was it not a part of SA.


The tools RPM was deprecated.  There was very little in there that wasn't
development related, which is better taken out of SVN or the tarball,
so ...


I'd recommend adding an Obsoletes tag for the deprecated subpackage, then. 
Otherwise the 3.0.4 subpackage gets orphaned and blocks updating of the 
surviving subpackages.





RE: filter for subjects

2005-08-15 Thread Herb Martin
 -Original Message-
 From: jdow [mailto:[EMAIL PROTECTED] 
 Sent: Monday, August 15, 2005 8:55 AM
 To: users@spamassassin.apache.org
 Subject: Re: filter for subjects
 
 1) You can use SARE rules to increase scores for words like viagra.
 2) You cannot under any circumstance have SpamAssassin not pass mail
on to the next delivery step. It is possible to have the next
delivery step drop the mail into /dev/null.
 3) It is not wise to get too frantic and drop things just because they
say Adobe. Some might be legitimate. (That is not a usual spam word
here, at least.)

Agreeing and elaborating on this and some of the
other suggestions...

SA drops NOTHING -- SA scores the spam or ham so
that THE ADMINISTRATOR or USER can decide what to do with
it.  Such decisions belong to the administrator and
the recipient of the email.

Some admins send the mail through leaving the entire 
decision to the User/recipient and some use various
criteria to reject, bounce (generally bad these days),
or save (some) of the likely spam for review.

We bounce nothing, but we do reject using SpamAssassin
this way using Exim MTA (other MTA can do something 
similar):

1) All spam is held for review (we have spam
down to such a small amount this is easy)
if it passes the next step.

2) If the score meets a superspam threshold
we use an Exim ACL (during Data time before
the email is accepted) to check subjects and
a few other such criteria (sender etc.)

Since SA has already marked the email as
seriously likely to be spam these checks
can be a bit looser than they would be if
the message were random.

Using the Adobe-subject example above:  If the subject
contains 2 of:  Microsoft Adobe Macromedia Corel AND the
message is SuperSpam, it is droppable.  This wouldn't be
possible without the SuperSpam condition, since a legitimate
news message could have a subject like "Adobe sues Microsoft"
or "Corel partners with Macromedia".  (This is just an example;
a more conservative filter could require, say, three of these
words, but that is up to the admin.)
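
The two-condition check described above can be sketched as follows. This is Python for illustration only, not the Exim ACL actually in use; the threshold value, the function name and the word list handling are assumptions, since the post does not give the real settings:

```python
import re

# Hypothetical values for illustration; the post does not state the real ones.
SUPERSPAM_THRESHOLD = 15.0
WATCH_WORDS = {"microsoft", "adobe", "macromedia", "corel"}
MIN_WORD_HITS = 2


def is_droppable(subject: str, sa_score: float) -> bool:
    """Drop only when SA already scored the message as 'superspam'
    AND the subject names at least MIN_WORD_HITS of the watched words."""
    if sa_score < SUPERSPAM_THRESHOLD:
        return False  # the loose subject check never runs on ordinary mail
    words = set(re.findall(r"[a-z]+", subject.lower()))
    return len(words & WATCH_WORDS) >= MIN_WORD_HITS


print(is_droppable("Adobe Microsoft Corel cheap OEM", 29.2))
print(is_droppable("Adobe sues Microsoft", 1.3))
```

Because the score test comes first, a low-scoring news item with the same words in its subject is never dropped; the subject test can stay loose precisely because SA has already done the hard part.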

Our spam went down to almost nothing by using Greylisting
in a reduced manner:  We avoid almost all of the problems 
associated with Greylisting by only using it for messages
that are already 'suspicious' (i.e., the things many people
use to REFUSE mail are used by our Exim ACLs to drive
the message through Greylisting).  93% of these messages
are never re-tried.  So far no good mail has been identified
as being dropped and practically no real mail is even delayed.

['Suspicion' Checks include: Header checks, valid reverse, 
valid Helo vs. reverse host name, SPF, dynamic host name
or certain country code patterns, and membership on blacklists, 
including some very aggressive lists since no list can actually
block the email.]

SpamAssassin never sees mail unless the other checks including
greylisting of suspicious messages pass them through.

If a message passes to SpamAssassin and is checked against
the simple subject etc. filters and not dropped it is STILL
driven through Greylisting if that has not already been done for
this message's Helo/From/To triplet.
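
A minimal sketch of greylisting applied only to suspicious mail, keyed on the Helo/From/To triplet mentioned above. It is Python rather than an Exim ACL, the class and parameter names are hypothetical, and the retry delay is an assumed value that the post does not specify:

```python
import time

MIN_RETRY_DELAY = 300  # seconds; hypothetical value, not taken from the post


class Greylist:
    def __init__(self):
        # (helo, sender, recipient) -> time of the first delivery attempt
        self._first_seen = {}

    def check(self, helo, sender, recipient, suspicious, now=None):
        """Return 'accept' or 'defer' for one delivery attempt."""
        if not suspicious:
            return "accept"  # clean-looking mail is never delayed
        now = time.time() if now is None else now
        key = (helo.lower(), sender.lower(), recipient.lower())
        first = self._first_seen.setdefault(key, now)
        if now - first >= MIN_RETRY_DELAY:
            return "accept"  # the sender came back after the delay
        return "defer"       # temporary failure; a real MTA will retry


g = Greylist()
print(g.check("mx.example.net", "a@example.net", "b@example.org", True, now=0))
print(g.check("mx.example.net", "a@example.net", "b@example.org", True, now=400))
```

Only the suspicious branch ever touches the table, which is why ordinary mail sees no delay at all under this scheme.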

This defense in depth is knocking spam down to a trickle
AT THE SERVER, and practically nothing* is getting through
to users with no complaints of missing mail or evidence of
such in the logs.

We are still manually reviewing the Spam trapped at the
server.

Nothing bounces.  Very little spam is ever accepted.

And 95% of the Spam we trap scores above 25 points.
Almost none is scored below 15 points.

We have practically none in the trough between Spam
and Ham -- it is all classifying cleanly which really
lets SpamAssassin shine.


--
Herb Martin



Re: Bonded Sender

2005-08-15 Thread Matt Kettler

At 10:18 AM 8/15/2005, Loren Wilton wrote:

My very minimal experience with Bonded Sender is that the people who
contract directly are mostly fairly legit.  The people who contract through
the clever guilt-sharing arrangement at constant contact are spammers.


Agreed.



Be aware though that MANY spammers forge bonded sender tags.  If you have
one of the older methods of checking bonded sender, it is very probable that
a lot of your failures are forgeries that the newer bonded sender methods
should correctly detect.


Erm, you're thinking of HABEAS SWE. Bonded sender doesn't have a tag in the 
headers, so there's nothing to forge.


Bondedsender is based on your IP address.. Bonded sender works like a 
DNSBL, but is a DNSWL (DNS white list).


If BSP_TRUSTED hits in SA one of two things is true:

1) The server delivering mail has a bond, and you can complain and cost the 
owner of the server money against the bond.


2) Your trusted_networks isn't set properly, usually due to having a NATed 
mailserver, or some other arrangement where the first internet routable, 
non-reserved, IP in the headers isn't your server. This causes SA to trust 
one more header than it should, and spammers can insert a forged Received: 
header that SA will honor for this test that it shouldn't.
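
A simplified model of case 2, sketched in Python. This is not SA's actual Received-header parsing; the function name is hypothetical and the addresses are documentation-range examples. It only shows why trusting one relay too many lets a forged header decide the lookup:

```python
import ipaddress


def first_untrusted(relay_ips, trusted_networks):
    """Walk relay IPs from the newest Received header backwards and return
    the first one that is not in a trusted network (or None)."""
    nets = [ipaddress.ip_network(n) for n in trusted_networks]
    for ip in relay_ips:
        addr = ipaddress.ip_address(ip)
        if not any(addr in net for net in nets):
            return ip
    return None


# Newest first: the host that really connected to us, then an address that
# appears only in a Received: header the sender wrote (possibly forged).
relays = ["203.0.113.9", "198.51.100.7"]

# Correct setup: only our own (NATed) hosts are trusted.
print(first_untrusted(relays, ["10.0.0.0/8"]))

# Wrong guess: the connecting host is mistaken for our own server, so the
# whitelist test is run against the forgeable address instead.
print(first_untrusted(relays, ["10.0.0.0/8", "203.0.113.9/32"]))
```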



Most people having problems with BSP are in category 2, or consider 
subscriber mail to be spam. (There is a lot of spam-ish subscriber mail out 
there, my users subscribe to lots of it, on purpose, it's often hard for me 
to tell without asking the recipient. I also have users that claim that 
amazon mail is spam, even though they bought items there and didn't clear 
the send me special offers check box.)



Of course, there are some real spammers using servers with real bonds... 
Start reporting them to bondedsender, the costs will eventually cause them 
to cancel the bond.


This goes double for contract-thru arrangements. The cost of the complaint 
goes against the bond, which will encourage the bonds owner to reduce spam 
volume to reduce their costs. If the money in the bond runs out, their BSP 
listing goes away. Although BSP might let them put more money in, you're at 
least incurring a direct cost to the sender of spam.




I would not go so far as to say bonded sender is crap.  I would however say
that it is of fairly minimal usefulness in detecting whether a message is
spam.  The SURBL list, for instance, is far, far, better.


Well, it's *completely* useless at detecting if a message is spam. So as a 
primary basis of a spam filter, I agree, it's useless.


BSP only tells you if the sending server has a bond, so it's only useful in 
telling you if the message is less likely to be spam.


BSP has no implications that would indicate any message is spam. And it 
doesn't even tell you the message isn't spam, it only tells you the server 
owner is putting his money where his mouth is.




Re: Very long scan times - Finding the culprit rule

2005-08-15 Thread Chr. v. Stuckrad
On Mon, Aug 15, 2005 at 07:27:33AM -0700, Loren Wilton wrote:
 You can stop the first two from being problems by running a manual expire
 from a cron job every so often and disabling the auto-expire runs.  You
 should have a limit of 250K or so on the mail size to try to keep the third
 from being a problem.

Did that, it works (mostly, see below)...

 Usually (at least in my experience) the way a rule is written doesn't affect
 the spamd memory size.

Sorry, this is definitely WRONG!

If you write (like I once did) some rule containing spurious
'arbitrary long ..*-Constructs', the regex-automaton goes crazy
and a mail of 250k may need more than 250MByte memory per child,
instead of the currently seen near 80M.

Simply 'shortening' the possible evaluation of the expression by
replacing '..*' by .{1,N} (with 'N' a 'reasonably short' number)
shrunk the problem to manageable sizes!

Since then I never again used .+ or .* but ALWAYS limit the length.
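
A small sketch of what the bounded form changes, in Python rather than Perl. It does not reproduce the memory growth described above, which depends on Perl's regex engine; it only shows that `.{1,N}` caps how far the engine may look between the two literals. The pattern and the test strings are made up:

```python
import re

# '..*' lets the gap between the two words be arbitrarily long.
unbounded = re.compile(r"free..*offer", re.DOTALL)
# '.{1,30}' allows at most 30 characters between them.
bounded = re.compile(r"free.{1,30}offer", re.DOTALL)

near = "free trial offer"
far = "free " + "x" * 5000 + " offer"

for name, text in (("near", near), ("far", far)):
    print(name, bool(unbounded.search(text)), bool(bounded.search(text)))
```

Both patterns match the short string; only the unbounded one matches across the 5000-character gap, so the bounded rule never has to consider spans of that size.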

Stucki


Re: How do I ask instration problem of Mail-SpamAssassin ?

2005-08-15 Thread Stuart Johnston
As stated at the bottom of the downloads page on spamassassin.org, 
"Older, Perl-licensed versions can be found via CPAN."


http://search.cpan.org/CPAN/authors/id/J/JM/JMASON/Mail-SpamAssassin-2.64.tar.gz


Atami Org. wrote:

Dear jdow;
I am not clear your English. Because I have less knowledge 
for both English and SpamAssassin.
However NOW I understand that noone have old versions of 
SpamAssassin as pablic. And NOW I understand that I have to search it myself.
Thank you again jdow;  I try to do it !
Eiji Hamano



 Google for it. You may have to look for the very earliest version you
 can find. And you will not be able to find any help for running it.
 We've all forgotten those bad old days.
 
 You got better advice from someone else directing you to a special
 interest mailing list for your machine regarding upgrading perl. I am
 willing to bet that 2.6x will not run on anything earlier than 5.5 perl.
 So you'll have to go back farther and farther into prehistorical times.
 Good luck if you persist on this path. I can't help you. (Nor would my
 conscience allow me to if I still retained the knowledge.)
 
 {^_^}
 - Original Message - 
 From: Atami Org. [EMAIL PROTECTED]

 
 
  Dear Loren and jdow;
  Thank you very much your replay.
  Where url do I get 2.64 ?
  Eiji Hamano
 
 
   Please advice me the old version of SpamAssassin which can be installed
   by perl 5.005. Where is the url ?
  
  Hum, I wonder if 2.64 would run on 5.005?  I don't recall anymore.
  
  I would not consider going back farther than 2.63/2.64.
  
  If those won't run on 5.005, then I would consider installing Perl in a
  separate directory path from the system Perl, and point SA at the local
  install.  I think you can do this with CPAN install stuff.  However, I
  have never done it myself, so can't tell you how.  Others likely can.
  
  Loren
   
 

 





Re: Bonded Sender

2005-08-15 Thread Loren Wilton
 Be aware though that MANY spammers forge bonded sender tags.  If you have
 one of the older methods of checking bonded sender, it is very probable
 that a lot of your failures are forgeries that the newer bonded sender
 methods should correctly detect.

 Erm, you're thinking of HABEAS SWE. Bonded sender doesn't have a tag in
 the headers, so there's nothing to forge.

Erp.  Yea.  Long night dealing with flash floods inside the house.

Loren



RE: Very long scan times - Finding the culprit rule

2005-08-15 Thread Herb Martin

 -Original Message-
 From: Paul J. Smith [mailto:[EMAIL PROTECTED] 
 DNS is working fine.  We've been running SA for 6 months no 
 problem, it's only when we added the extra 10 rule sets it 
 got bogged down.  I've just been removing them one by one at 
 the moment and have got the timing back down to 6 secs or so, 
 but it would be very handy to have the actual times of each 
 test logged so I can see which are the slow ones.

You didn't by any chance add one of the very large blacklist
rule sets did you?

I have two of these that run 1-2 MB and cannot run them without
SpamD getting unreliable (slow, sluggish etc.)

--
Herb Martin



Couldn't find a good delta atime

2005-08-15 Thread Chris Conn

Hello,

When I run the sa-learn --force-expire on a regular basis, I eventually 
run into this:


debug: bayes: expiry check keep size, 0.75 * max: 562500
debug: bayes: token count: 675802, final goal reduction size: 113302
debug: bayes: First pass?  Current: 1124120405, Last: 1124089241, atime: 124581, count: 403471, newdelta: 443635, ratio: 3.56102275334946, period: 43200
debug: bayes: Can't use estimation method for expiry, something fishy, calculating optimal atime delta (first pass)
debug: bayes: expiry max exponent: 9
debug: bayes: atime      token reduction
debug: bayes: ========   ===============
debug: bayes: 43200      450137
debug: bayes: 86400      341434
debug: bayes: 172800     135964
debug: bayes: 345600     0
debug: bayes: 691200     0
debug: bayes: 1382400    0
debug: bayes: 2764800    0
debug: bayes: 5529600    0
debug: bayes: 11059200   0
debug: bayes: 22118400   0
debug: bayes: couldn't find a good delta atime, need more token difference, skipping expire.

debug: Syncing complete.
debug: bayes: 18908 untie-ing
debug: bayes: 18908 untie-ing db_toks
debug: bayes: 18908 untie-ing db_seen
debug: bayes: files locked, now unlocking lock


Once it is in this state, I never can recover and have to zap the database.

What could cause this?

Thanks,

Chris


Re: How do I ask instration problem of Mail-SpamAssassin ?

2005-08-15 Thread Justin Mason
Loren Wilton writes:
  However NOW I understand that noone have old versions of
  SpamAssassin as pablic.
 
 There should be a version of 2.64 available publicly on the net someplace.
 I would expect on the SA site someplace.  It was the last version before
 3.0.
 
 However, I am not sure that 2.64 will work with 5.005.  You will have to
 check that.
 
 I think your best course would be to go to the mailing list someone pointed
 you to that discusses upgrading RaQ servers, and talk to them about
 upgrading to a newer Perl.

Yes, Loren's right.

None of the 2.6x series support 5.005.  I don't think the 2.5x series
did, either.  5.005 is *very* outdated by now.

--j.



Re: Very long scan times - Finding the culprit rule

2005-08-15 Thread Justin Mason
Chr. v. Stuckrad writes:
 On Mon, Aug 15, 2005 at 07:27:33AM -0700, Loren Wilton wrote:
  You can stop the first two from being problems by running a manual expire
  from a cron job every so often and disabling the auto-expire runs.  You
  should have a limit of 250K or so on the mail size to try to keep the third
  from being a problem.
 
 Did that, it works (mostly, see below)...
 
  Usually (at least in my experience) the way a rule is written doesn't affect
  the spamd memory size.
 
 Sorry, this is definitely WRONG!
 
 If you write (like I once did) some rule containing spurious
 'arbitrary long ..*-Constructs', the regex-automaton goes crazy
 and a mail of 250k may need more than 250MByte memory per child,
 instead of the currently seen near 80M.
 
 Simply 'shortening' the possible evaluation of the expression by
 replacing '..*' by .{1,N} (with 'N' a 'reasonably short' number)
 shrunk the problem to manageable sizes!
 
 Since then I never again used .+ or .* but ALWAYS limit the length.

Yep, that's correct.  It's important to *never* use .* or .+
in rules.

--j.



Re: Bonded Sender

2005-08-15 Thread Greg Allen
First thing I do whenever I do an upgrade of SA is to go through and zero
out any rules that suppose someone is a good player. I don't believe in
someone being able to pay to send my system spam. Any such whitelist
systems will eventually be abused IMO. Spammers look at SA rules and take
the easiest target to bypass SA. The easiest target is the large negative
rules. I chose not to provide them any backdoor advantage at all. In other
words, I don't give any large negative points away that could be targeted,
I zero those large negatives out or make them minimal in value.



 We're moving away from our current antispam setup which uses the bonded
 sender list.  In doing some checking to see how I want to setup SA, I
 noticed that currently many messages that look like spam are being
 whitelisted by our current setup because of the bonded sender list.

 What is the basic feeling of Bonded Sender in the SA world?  I'm
 thinking it's crap...

 Thanks,
 Russ
 ---
 [This E-mail scanned for viruses by Declude Virus]





Re: How do I ask instration problem of Mail-SpamAssassin ?

2005-08-15 Thread Theo Van Dinter
On Mon, Aug 15, 2005 at 10:33:51AM -0700, Justin Mason wrote:
 None of the 2.6x series support 5.005.  I don't think the 2.5x series
 did, either.  5.005 is *very* outdated by now

Actually, 2.6x does support 5.005, at least according to our documentation:

The SpamAssassin 2.6x release series will be the last set of releases
to officially support perl versions earlier than perl 5.6.0.

We changed this with 3.0 to require 5.6.1.

-- 
Randomly Generated Tagline:
Um, hi, Bart. I know you from school.
 
--Ralph Wiggum
  Bart Sells His Soul (Episode 3F02)




Where should I adjust scoring

2005-08-15 Thread Sloan, Craig
I've inherited a SA ver 3.0.1 box that is running great (thus my lack of
intimacy with it). I would like to adjust some of the scoring, and I
want to make sure that I change it in the correct location. I've seen a
couple of locations suggested and I'm not sure which would be preferred
and/or better.

The spamd daemon is running under the user 'spamfilter'. Should I adjust
it in /home/spamfilter/.spamassassin/user_prefs or in
/etc/mail/spamassassin/local.cf?

Thanks,
Craig Sloan



Re: Where should I adjust scoring

2005-08-15 Thread Bob McClure Jr
On Mon, Aug 15, 2005 at 03:03:11PM -0400, Sloan, Craig wrote:
 I've inherited a SA ver 3.0.1 box that is running great (thus my lack of
 intimacy with it). I would like to adjust some of the scoring, and I
 want to make sure that I change it in the correct location. I've seen a
 couple of locations suggested and I'm not sure which would be preferred
 and/or better.
 
 The spamd daemon is running under the user 'spamfilter'. Should I adjust
 it in /home/spamfilter/.spamassassin/user_prefs or in
 /etc/mail/spamassassin/local.cf?

The latter.  SA won't read /home/spamfilter/.spamassassin/user_prefs
unless it's processing email for spamfilter, and it's being called
from something like ~spamfilter/.procmailrc.

 Thanks,
 Craig Sloan

Also, you should upgrade to v3.0.4.  Versions 3.0.1-3 have a DoS
(denial-of-service) vulnerability.

Cheers,
-- 
Bob McClure, Jr. Bobcat Open Systems, Inc.
[EMAIL PROTECTED]  http://www.bobcatos.com
God doesn't have (or need) a Plan B.


test for multipart/alternative discrepancies?

2005-08-15 Thread Mike Jackson
I've been getting quite a few spams (which slipped past SA) in the last few 
minutes with subject lines like dies in McDonalds, so I looked at the 
message source to see how they were scoring (which I've included below). In 
all the cases, the HTML content (at least as displayed in Outlook Express) 
was fairly consistent, but the plain text version looked like typical Bayes 
poisoning text.


Would it be possible to craft a rule that roughly compares the text/plain 
and HTML-stripped text/html versions of a message and scored against them if 
the words they contained were significantly different? Or is that 
technically infeasible?





 Content-Type: text/plain;


Hello,
5.  Kislovodsk:  Literally  `acid  waters,  a  popular resort  in  t=
he =
   `Thats wonderful! Koroviev  yelled. Somewhat stunned by his  =
chatter,that  one  could execute  such  a man.  There  had  been  no  =
execution!  Nocloser, youll see the details.midnight moon. A greenish =
kerchief of  night-light fell from the window-sillup still more ... She =
greedily began gulping down caviar.up to the footboard of an A tram =
waiting at a stop, brazenly elbow aside a Here he applauded, but =
quite  alone, while a confident smile  played onthat might occur at the =
time of the execution in the city of Yershalaim,  sospeaking, I had =
nothing more to do, and I lived from one meeting with her toPetrakovs. =
Placing his bulging briefcase on the table, Boba  immediately =
putposts?[6]horizon. He did not rejoice in the staggeringly beautiful =
view  which openedpaying or free, but even changes countenance at any =
theatrical conversation.what she was going to tell the neighbours the =
next day.phrase:

#

 Content-Type: text/html;

!DOCTYPE HTML PUBLIC -//W3C//DTD HTML 4.0 Transitional//EN
HTMLHEAD
META http-equiv=3DContent-Type content=3Dtext/html; charset=3Dus-ascii
META content=3DMSHTML 6.00.2800.1106 name=3DGENERATOR
STYLE/STYLE
/HEAD
BODY bgColor=3D#ff
DIVFONT face=3DArial/FONTnbsp;/DIV
DIVFONT face=3DArialA court has sentenced a man to life in jail for the 
=

=

bombing of a McDonald's restaurant, which left three people =
dead./FONT/DIV
DIVFONT face=3DArial/FONTnbsp;/DIV
DIVFONT face=3DArialThe man, Agung Abdul Hamid, was found guilty of =
financing
and co-ordinating the attack./FONT/DIV
DIVFONT face=3DArial/FONTnbsp;/DIV
DIVFONT face=3DArialA href=3Dhttp://www.ildhd.lastrez.com;Read full =
=
story./A/FONT/DIV
DIVnbsp;/DIV/BODY/HTML



Re: test for multipart/alternative discrepancies?

2005-08-15 Thread Theo Van Dinter
On Mon, Aug 15, 2005 at 12:33:31PM -0700, Mike Jackson wrote:
 Would it be possible to craft a rule that roughly compares the text/plain 
 and HTML-stripped text/html versions of a message and scored against them 
 if the words they contained were significantly different? Or is that 
 technically infeasible?

You mean MPART_ALT_DIFF ?  ;)
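
For illustration, the kind of comparison asked about can be sketched like this. It is Python and is not how SA implements MPART_ALT_DIFF; the function names and the word-set measure are assumptions chosen to keep the example short:

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document, dropping the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def words(text):
    return set(re.findall(r"[a-z']+", text.lower()))


def html_words(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return words(" ".join(parser.chunks))


def difference_ratio(plain, html):
    """Fraction of distinct words found in only one of the two parts
    (0.0 = same vocabulary, 1.0 = nothing in common)."""
    a, b = words(plain), html_words(html)
    union = a | b
    if not union:
        return 0.0
    return len(a ^ b) / len(union)


html_part = "<div><font face='Arial'>A court has sentenced a man</font></div>"
print(difference_ratio("A court has sentenced a man", html_part))
print(difference_ratio("midnight moon greenish kerchief", html_part))
```

A message whose plain part is unrelated filler scores close to 1.0, while an honest multipart/alternative message scores close to 0.0.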

-- 
Randomly Generated Tagline:
You're not significant until someone complains about you publically.
 - Theo Van Dinter




Re: test for multipart/alternative discrepancies?

2005-08-15 Thread Matt Kettler
Mike Jackson wrote:
 I've been getting quite a few spams (which slipped past SA) in the last
 few minutes with subject lines like dies in McDonalds, so I looked at
 the message source to see how they were scoring (which I've included
 below). In all the cases, the HTML content (at least as displayed in
 Outlook Express) was fairly consistent, but the plain text version
 looked like typical Bayes poisoning text.
 

Really, I'd be looking into why the messages got past SA. Did it get a decent
BAYES_ score? The bayes poison really shouldn't be a problem.

The use of chi-squared combining makes bayes poisoning pretty ineffective as
long as you're training your bayes often and training well.

And by training well I specifically mean you must train spam messages
containing poison as spam. If you're avoiding training poison, then you
yourself are making that poison effective.

(Bayes can only be as accurate as its training. If its not getting realistic
training, it won't do well with realistic mail.)
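
A rough sketch of why training matters, using chi-squared combining in the form popularised by Gary Robinson and SpamBayes. SA's own code may differ in detail, and the per-token probabilities below are invented purely for illustration:

```python
import math


def chi2q(x2, v):
    """P(chi-square with v degrees of freedom >= x2); v must be even."""
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def combine(probs):
    """Combine per-token spam probabilities (each strictly between 0 and 1)."""
    n = len(probs)
    if n == 0:
        return 0.5
    s = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    h = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (s - h + 1.0) / 2.0


spam_tokens = [0.99] * 10        # strongly spammy tokens
poison_untrained = [0.10] * 10   # poison words that still look like ham
poison_trained = [0.70] * 10     # same words after being learned as spam

print(combine(spam_tokens))
print(combine(spam_tokens + poison_untrained))
print(combine(spam_tokens + poison_trained))
```

With these made-up numbers the untrained poison drags the result toward the middle, while the same words trained as spam leave the message scoring near 1.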


RE: test for multipart/alternative discrepancies?

2005-08-15 Thread Herb Martin
 -Original Message-
 From: Mike Jackson [mailto:[EMAIL PROTECTED] 
 Sent: Monday, August 15, 2005 2:34 PM
 To: users@spamassassin.apache.org
 Subject: test for multipart/alternative discrepancies?
 
 I've been getting quite a few spams (which slipped past SA) 
 in the last few minutes with subject lines like dies in 
 McDonalds, so I looked at the message source to see how they 
 were scoring (which I've included below). In all the cases, 
 the HTML content (at least as displayed in Outlook Express) 
 was fairly consistent, but the plain text version looked like 
 typical Bayes poisoning text.
 
 Would it be possible to craft a rule that roughly compares 
 the text/plain and HTML-stripped text/html versions of a 
 message and scored against them if the words they contained 
 were significantly different? Or is that technically infeasible?

Found one in my trap -- SpamAssassin (3.1.0-rc1) with lots of SARE
and many network tests scored it:  29.2

Bayes only scored it at 50% which was good for only 0.9 points.

Content analysis details: (29.2 points, 6.0 required)

 pts rule name                 description
---- ------------------------- --------------------------------------------------
 1.1 SPF_FAIL                  SPF: sender does not match SPF record (fail)
                               [SPF failed: Please see http://spf.pobox.com/why.html?sender=tiffiny%40karta.com%3E%0Atiffiny%40karta.comip=58.51.205.72receiver=www.LearnQuick.Com]
 3.5 SPF_HELO_FAIL             SPF: HELO does not match SPF record (fail)
                               [SPF failed: Please see http://spf.pobox.com/why.html?sender=karta.comip=58.51.205.72receiver=www.LearnQuick.Com]
 0.7 MPART_ALT_DIFF_COUNT      BODY: HTML and text parts are different
 1.0 HTML_MESSAGE              BODY: HTML included in message
 0.9 BAYES_50                  BODY: Bayesian spam probability is 40 to 60%
                               [score: 0.4999]
 0.7 Y_SILLY_SALUTATION        RAW: Foobar,+ salutation
 1.5 RAZOR2_CF_RANGE_E8_51_100 Razor2 gives engine 8 confidence level above 50%
                               [cf: 100]
 0.5 RAZOR2_CHECK              Listed in Razor2 (http://razor.sf.net/)
 1.5 RAZOR2_CF_RANGE_E4_51_100 Razor2 gives engine 4 confidence level above 50%
                               [cf: 100]
 2.0 RAZOR2_CF_RANGE_51_100    Razor2 gives confidence level above 50%
                               [cf: 100]
 3.7 PYZOR_CHECK               Listed in Pyzor (http://pyzor.sf.net/)
 1.5 NO_DNS_FOR_FROM           DNS: Envelope sender has no MX or A DNS records
 1.6 URIBL_SBL                 Contains an URL listed in the SBL blocklist
                               [URIs: lastrez.com]
 2.5 URIBL_BLACK               Contains an URL listed in the URIBL blacklist
                               [URIs: lastrez.com]
 4.5 URIBL_SC2_SURBL           Has URI in SC2 at http://www.surbl.org/lists.html
                               [URIs: lastrez.com]
 1.0 DIGEST_MULTIPLE           Message hits more than one network digest check
 0.9 FM_NO_STYLE               FM_NO_STYLE

Subject: * SPAM *_29.2 McDonÂld's bomber jailed

--
Herb Martin



Re: AW: filter for subjects

2005-08-15 Thread mouss

Fettke, Dirk a écrit :


Sorry, I forgot.

I'm using postfix, amavisd, spamassassin

In my local.cf I have insert these lines like Matt Kettler told me.

header BANNED_SUB1  Subject =~ /viagra/i
score BANNED_SUB1   100

Unfortunately it doesn't work. When I write myself an email from gmx with the subject 
viagra it will be delivered to my mailbox.

 

if you really want to reject such messages, then do it with postfix 
header_checks. this way, the message doesn't even get queued, and if the 
sender is legit, he will know.


Re: FYI: ccTLD .de listed in RFC-ignorant.org

2005-08-15 Thread Matt Kettler
Dirk Bonengel wrote:
 FYI:
 rfc-ignorant.org has .de listed in whois.rfc-ignorant.com.

As others pointed out, it's listed 127.0.0.7 not .5.

 
 http://www.rfc-ignorant.org/tools/detail.php?domain=desubmitted=1120996396table=whois
 
 In a standard 3.0.x install, DNS_FROM_RFC_WHOIS gives a score of 0.492
 (net) or 0.296 (net+bayes).



However, all that said, 0.492 is a pretty small score for a rule. And that low
score is a reflection of RFCI's occasional FP problems and low general hit rate
on spam.

Even if the rule was hitting all of .de, it really isn't that significant of a
score. (Unless you're talking about nutjobs with spam thresholds set at 1.0).

With such a low score, I really wouldn't worry much even if it was hitting.
Unless you're dealing with nutjobs that have spam thresholds set at 1.0 it
really isn't very significant.

Now 3.1.0-pre1 has a higher score for it. (1.45 in set3). That I might worry a
bit if it was false hitting.












First 3.1 observation

2005-08-15 Thread Steve Martin
The first thing I've noticed after running 3.1pre1 for a few days is  
that I'm getting much less bayes auto learning of ham due to the fact  
that most of my messages from mailing lists fail SPF tests and get  
penalized 2.4-2.6 points or so for it.  They still aren't marked as  
spam, but with higher scores than before.


Seems like we should have a way to disable SPF tests for mailing  
lists since SPF is known not to work for them.



--
Steve Martin  http://www.cheezmo.com/
Smart Calibration, LLC   http://www.smartcalibration.com/
The Widescreen Movie Centerhttp://www.widemovies.com/
Letterboxed Movie TV Schedule  http://www.widemovies.com/lbx.html



Re: Spam-Status tag with score numbers?

2005-08-15 Thread mouss

Simon Oosthoek a écrit :



I tried this, but nothing changed, certainly not the tests with scores 
as I'd like to have. I'm starting to think your suggestion #3 is maybe 
the culprit, I'll see if I can ask the question on the amavis list...


amavisd regenerates SA headers. you need to patch amavisd.
In amavisd (/usr/*/sbin/amavisd), look for X-Spam-Score and uncomment 
the corresponding statement.
(you may want to do the same with the checker version header to enable 
SA version header).

Restart amavisd. you should then get headers like this:
   X-Spam-Status: Yes, hits=5.7 required=5
        tests=[MSGID_OUTLOOK_INVALID=2.7, URIBL_BLACK=3]
   X-Spam-Score: 5.7
   X-Spam-Level: *
   X-Spam-Flag: YES





Re: First 3.1 observation

2005-08-15 Thread Matt Kettler
Steve Martin wrote:
 The first thing I've noticed after running 3.1pre1 for a few days is 
 that I'm getting much less bayes auto learning of ham due to the fact 
 that most of my messages from mailing lists fail SPF tests and get 
 penalized 2.4-2.6 points or so for it.  They still aren't marked as 
 spam, but with higher scores than before.
 
 Seems like we should have a way to disable SPF tests for mailing  lists
 since SPF is known not to work for them.

Why? it should work perfectly for this message.

SPF should be looking at the Return-Path header, not the From: header.
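
A short sketch of which identity is involved, in Python. It performs no SPF or DNS lookup at all; the message, the addresses and the helper name are made up, and it only shows that the envelope sender recorded in Return-Path can belong to the list server while From: still names the original author:

```python
from email import message_from_string
from email.utils import parseaddr

raw = """\
Return-Path: <users-return-12345@lists.example.org>
From: Alice Example <alice@example.com>
To: users@lists.example.org
Subject: hello

body
"""


def domain_of(header_value):
    """Return the lower-cased domain of the address in a header value."""
    addr = parseaddr(header_value or "")[1]
    return addr.rpartition("@")[2].lower()


msg = message_from_string(raw)
spf_domain = domain_of(msg["Return-Path"])  # the identity SPF evaluates
from_domain = domain_of(msg["From"])        # the identity the reader sees

print(spf_domain)
print(from_domain)
```

When a list rewrites the envelope sender like this, SPF is evaluated against the list's own domain, which is why list mail can pass even though the From: domain never authorised the list server.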


Re: FYI: ccTLD .de listed in RFC-ignorant.org

2005-08-15 Thread List Mail User
...
Dirk Bonengel wrote:
 FYI:
 rfc-ignorant.org has .de listed in whois.rfc-ignorant.com.

As others pointed out, it's listed 127.0.0.7 not .5.

 
 http://www.rfc-ignorant.org/tools/detail.php?domain=desubmitted=1120996396table=whois
 
 In a standard 3.0.x install, DNS_FROM_RFC_WHOIS gives a score of 0.492
 (net) or 0.296 (net+bayes).



However, all that said, 0.492 is a pretty small score for a rule. And that low
score is a reflection of RFCI's occasional FP problems and low general hit rate
on spam.

Even if the rule was hitting all of .de, it really isn't that significant of a
score. (Unless you're talking about nutjobs with spam thresholds set at 1.0).

With such a low score, I really wouldn't worry much even if it was hitting.
Unless you're dealing with nutjobs that have spam thresholds set at 1.0 it
really isn't very significant.

Now 3.1.0-pre1 has a higher score for it. (1.45 in set3). That I might worry a
bit if it was false hitting.

As usual, Matt has correctly stated the situation.  But another
way to view it is that RFCI is *not* a spam list (or lists) - it is a
group of lists of domains which violate particular RFCs.  It just happens
that spammers (and many large companies) are among those who choose to
ignore the RFCs, and of those who get reported (re. nominated) and listed,
spammers are a disproportionate group - which makes the rfci lists a good
spam sign (not even close to spam lists like the SURBLs, but more than
good enough to warrant the scores which have been computed).  When viewed
as RFCI sees itself, FPs are quite low; it is simply that many non-spammers,
through ignorance, choice or both, decide not to or forget to abide by the
rules RFCI checks for (much better stated on their web site than by me).

All the above said, they work even better as URI_ rules, though
they are not used that way in any SpamAssassin distribution by default.
Used that way, they hit a *much* larger proportion of spam and a somewhat
lower proportion of ham - i.e. the S/O ratio is better when used as
URI_ rules than when used as DNS_FROM_ or RCVD_ rules (spammers
often use open relays or proxies, which avoids many rfci lists, but the
web sites themselves are still listed at rfci as non-compliant domains).


Paul Shupak
[EMAIL PROTECTED]


Re: First 3.1 observation

2005-08-15 Thread List Mail User
...
> The first thing I've noticed after running 3.1pre1 for a few days is
> that I'm getting much less bayes auto learning of ham due to the fact
> that most of my messages from mailings lists fail SPF tests and get
> penalized 2.4-2.6 points or so for it.  They still aren't marked as
> spam, but with higher scores than before.
>
> Seems like we should have a way to disable SPF tests for mailing
> lists since SPF is known not to work for them.
>
> --
> Steve Martin                   http://www.cheezmo.com/
> Smart Calibration, LLC         http://www.smartcalibration.com/
> The Widescreen Movie Center    http://www.widemovies.com/
> Letterboxed Movie TV Schedule  http://www.widemovies.com/lbx.html


It must be the mailing lists you subscribe to (or some exploder
or forwarder).  I find most lists, like this one, pass SPF checks.

Paul Shupak
[EMAIL PROTECTED]


Re: test for multipart/alternative discrepancies?

2005-08-15 Thread Mike Jackson

> On Mon, Aug 15, 2005 at 12:33:31PM -0700, Mike Jackson wrote:
>> Would it be possible to craft a rule that roughly compares the text/plain
>> and HTML-stripped text/html versions of a message and scores against them
>> if the words they contained were significantly different? Or is that
>> technically infeasible?
>
> You mean MPART_ALT_DIFF ?  ;)

Well blow me down  :) Strange that I didn't see that rule hit on this
message though.




Re: First 3.1 observation

2005-08-15 Thread Steve Martin

Well, it doesn't ;-)

On Aug 15, 2005, at 6:02 PM, Matt Kettler wrote:


Return-Path: [EMAIL PROTECTED]
X-Original-To: [EMAIL PROTECTED]
Delivered-To: [EMAIL PROTECTED]
Received: by cheezmo.com (Postfix, from userid 88)
    id 30552EBDC5; Mon, 15 Aug 2005 18:03:32 -0500 (CDT)
X-Spam-Flag: NO
X-Spam-Checker-Version: SpamAssassin 3.1.0-rc1 (2005-08-11) on closet.local
X-Spam-Level:
X-Spam-Hammy: Tokens not available.
X-Spam-Status: No, score=-5.2 required=5.0 tests=AWL,FORGED_RCVD_HELO,
    SPF_HELO_SOFTFAIL,USER_IN_WHITELIST_TO autolearn=no version=3.1.0-rc1
X-Spam-Spammy: Tokens not available.
X-Spam-Tokens: Bayes not run.
X-Spam-Report:
    *  0.1 FORGED_RCVD_HELO Received: contains a forged HELO
    * -6.0 USER_IN_WHITELIST_TO User is listed in 'whitelist_to'
    *  2.4 SPF_HELO_SOFTFAIL SPF: HELO does not match SPF record (softfail)
    *  [SPF failed: ]
    * -1.8 AWL AWL: From: address is in the auto white-list
Received: from xanadu.evi-inc.com (xan.evitechnology.com [208.39.141.86])
    by cheezmo.com (Postfix) with ESMTP id 816AAEBDBA
    for [EMAIL PROTECTED]; Mon, 15 Aug 2005 18:02:57 -0500 (CDT)
Received: from [10.0.6.1] (EVI802-275.evitechnology.com [10.0.6.1])
    (authenticated bits=0)
    by xanadu.evi-inc.com (8.12.8/8.12.8) with ESMTP id j7FN27bt005517;
    Mon, 15 Aug 2005 19:02:07 -0400
Message-ID: [EMAIL PROTECTED]
Date: Mon, 15 Aug 2005 19:02:06 -0400
From: Matt Kettler [EMAIL PROTECTED]
User-Agent: Mozilla Thunderbird 1.0.6 (Windows/20050716)
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Steve Martin [EMAIL PROTECTED]
Cc: users@spamassassin.apache.org
Subject: Re: First 3.1 observation
References: [EMAIL PROTECTED]
In-Reply-To: [EMAIL PROTECTED]
X-Enigmail-Version: 0.92.0.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Greylist: Sender succeded SMTP AUTH authentication, not delayed  
by milter-greylist-2.0b2 (xanadu.evi-inc.com [192.168.50.2]); Mon,  
15 Aug 2005 19:02:07 -0400 (EDT)
X-EVI-MailScanner-Information: Please contact the EVI IT dept for  
more information

X-EVI-MailScanner: Found to be clean
X-EVI-MailScanner-SpamCheck: not spam, SpamAssassin (score=-3.001,
required 5, BAYES_00 -3.00, INFO_GREYLIST_NOTDELAYED -0.00)
X-MailScanner-From: [EMAIL PROTECTED]
Status:



--
Steve Martin  http://www.cheezmo.com/
Smart Calibration, LLC   http://www.smartcalibration.com/
The Widescreen Movie Center    http://www.widemovies.com/
Letterboxed Movie TV Schedule  http://www.widemovies.com/lbx.html



Re: ANNOUNCE: SpamAssassin 3.1.0-rc1 release candidate available!

2005-08-15 Thread jdow

From: Kenneth Porter [EMAIL PROTECTED]

--On Saturday, August 13, 2005 6:58 PM -0400 Theo Van Dinter 
[EMAIL PROTECTED] wrote:



On Sat, Aug 13, 2005 at 03:07:14PM +0530, Ramprasad A Padmanabhan wrote:

When I build the rpm from the spec file ( on fedora core 3 ) the
spamassassin-tools rpm is not created. Was it not a part of SA.


The tools RPM was deprecated.  There was very little in there that wasn't
development related, which is better taken out of SVN or the tarball,
so ...


I'd recommend adding an Obsoletes tag for the deprecated subpackage, then. 
Otherwise the 3.0.4 subpackage gets orphaned and blocks updating of the 
surviving subpackages.


What sub-packages that a CPAN style update won't catch?

{^_^} 





Re: First 3.1 observation

2005-08-15 Thread Steve Martin

Not for me...

* -6.0 USER_IN_WHITELIST_TO User is listed in 'whitelist_to'
*  2.4 SPF_HELO_SOFTFAIL SPF: HELO does not match SPF record (softfail)
*  [SPF failed: ]
* -1.3 AWL AWL: From: address is in the auto white-list


That is from your message...

On Aug 15, 2005, at 6:17 PM, List Mail User wrote:


> It must be the mailing lists you subscribe to (or some exploder
> or forwarder).  I find most lists, like this one, pass SPF checks.
>
> Paul Shupak
> [EMAIL PROTECTED]



--
Steve Martin  http://www.cheezmo.com/
Smart Calibration, LLC   http://www.smartcalibration.com/
The Widescreen Movie Center    http://www.widemovies.com/
Letterboxed Movie TV Schedule  http://www.widemovies.com/lbx.html



Re: Bonded Sender

2005-08-15 Thread jdow

From: Matt Kettler [EMAIL PROTECTED]

Most people having problems with BSP are in category 2, or consider 
subscriber mail to be spam. (There is a lot of spam-ish subscriber mail 
out there, my users subscribe to lots of it, on purpose, it's often hard 
for me to tell without asking the recipient. I also have users that claim 
that amazon mail is spam, even though they bought items there and didn't 
clear the send me special offers check box.)


At this precise moment I am somewhat hot under the collar with
subscriber sort of mail from American Honda Motors Acura people. I
have been unable to disabuse them about my name being Greg (hardly),
my being an Acura owner, my ever intending to own an Acura, and my
future intentions for a visit to nearby Torrance with a load of
fertilizer for their facility. They have no unsubscribe. They have
no other way to get off their infernal Owner list. So I rather
nominate them for a black hole listing infinitely deep. I wish they
had a bond so I could nail them for some money. After 5 years of it
I am getting irritated even seeing it in my spam bin.

{^_^} 





Re: How do I ask instration problem of Mail-SpamAssassin ?

2005-08-15 Thread jdow

How does one search there for a specific version? Just for grins and
giggles I tried to search for Mail::SpamAssassin-3.10 and received
8140 results listings. It should either return the RC1 or it should
return nothing. Ah well. It's another Bugzilla search.

Nor do I think 2.64 will run with 5.005. I do not know what version,
if any, will. Nor can I live with myself if I help someone set up a
version that old. Do nothing destructive (except to spammers).

{^_^}
- Original Message - 
From: Stuart Johnston [EMAIL PROTECTED]



As stated at the bottom of the downloads page on spamassassin.org, Older, 
Perl-licensed versions can be found via CPAN.


http://search.cpan.org/CPAN/authors/id/J/JM/JMASON/Mail-SpamAssassin-2.64.tar.gz





Re: Very long scan times - Finding the culprit rule

2005-08-15 Thread jdow

From: Herb Martin [EMAIL PROTECTED]

-Original Message-
From: Paul J. Smith [mailto:[EMAIL PROTECTED] 
DNS is working fine.  We've been running SA for 6 months no 
problem, it's only when we added the extra 10 rule sets it 
got bogged down.  I've just been removing them one by one at 
the moment and have got the timing back down to 6 secs or so, 
but it would be very handy to have the actual times of each 
test logged so I can see which are the slow ones.


You didn't by any chance add one of the very large blacklist
rule sets did you?

I have two of these that run 1-2 MB and cannot run them without
SpamD getting unreliable (slow, sluggish etc.)


Jettison those rule sets and run the SURBL tests instead. The SURBL
tests are MUCH more up to date.
{^_^}



Re: First 3.1 observation

2005-08-15 Thread Steve Martin
Looks like I was having a DNS problem.  Not sure why it would turn
into SPF_FAILs, though, since I think it would fail to get the SPF
record, and at that point shouldn't it skip the SPF rules?

I reran some of the messages that had been failing and they are fine
now.


On Aug 15, 2005, at 6:17 PM, List Mail User wrote:


> It must be the mailing lists you subscribe to (or some exploder
> or forwarder).  I find most lists, like this one, pass SPF checks.
>
> Paul Shupak
> [EMAIL PROTECTED]



--
Steve Martin  http://www.cheezmo.com/
Smart Calibration, LLC   http://www.smartcalibration.com/
The Widescreen Movie Center    http://www.widemovies.com/
Letterboxed Movie TV Schedule  http://www.widemovies.com/lbx.html



Re: How to use Multilog ?

2005-08-15 Thread George Georgalis
On Mon, Aug 15, 2005 at 06:45:52PM +0700, Dhanny Kosasih wrote:
> I use SpamAssassin 3.0.4 with FC3, and I use the script from Fedora to start
> and stop spamd. But qmailmrtg7 can't read the standard SpamAssassin log;
> it can only read logs in multilog format. I tried --syslog=stderr, but I
> don't know where the log file is. How can SpamAssassin log in multilog
> format?

$ head -n3 /service/spamd/run /service/spamd/log/run
==> /service/spamd/run <==
#!/bin/sh
MAX=3
exec spamd -i -A 127.0.0.0/8,10.0.0.0/8,192.168.0.0/16 -m ${MAX} --username=qmaild --syslog=stderr 2>&1

==> /service/spamd/log/run <==
#!/bin/sh
exec setuidgid log multilog t /var/log/spamd


-- 
George Georgalis, systems architect, administrator IXOYE
http://galis.org/ cell:646-331-2027 mailto:[EMAIL PROTECTED]


Re: First 3.1 observation

2005-08-15 Thread List Mail User
...
> Not for me...
>
> * -6.0 USER_IN_WHITELIST_TO User is listed in 'whitelist_to'
> *  2.4 SPF_HELO_SOFTFAIL SPF: HELO does not match SPF record (softfail)
> *  [SPF failed: ]
> * -1.3 AWL AWL: From: address is in the auto white-list
>
> That is from your message...

I get SPF_PASS.  Do you have any internal forwarding happening
that might be upsetting the trusted path?  Also, maybe an exploder (for
multiple recipients at your site) or a forwarder (generally breaks SPF,
one of the *real* problems with it).  I do forward internally, but check
SPF in the first machine in the chain, so all the lists I subscribe to
(quite a large number - hence List Mail User) give either SPF_PASS or
both SPF_PASS and SPF_HELO_PASS; I can't find any of dozens that give
a FAILURE.  But I only run 3.1 for testing and am using 3.0.4 for the
production machines, so there might be a bug.

Can you give an example of headers (recipient can be munged away)
and the SPF record?  E.g. for this list I see:
% dig spamassassin.apache.org any @ns1.us.bitnames.com
...
spamassassin.apache.org. 1800   IN  TXT v=spf1 mx -all
...

and

% dig spamassassin.apache.org mx @ns1.us.bitnames.com
...
spamassassin.apache.org. 1800   IN  MX  10 asf.osuosl.org.
spamassassin.apache.org. 1800   IN  MX  20 mail.apache.org.
...

and the mail is indeed delivered from hermes.apache.org[209.237.227.199]

% host 209.237.227.199
199.227.237.209.in-addr.arpa domain name pointer hermes.apache.org.

% host mail.apache.org
mail.apache.org has address 209.237.227.199

So everything matches.  Possibly I haven't played enough with real
mail and 3.1 to see the problem - it appears that the double lookup is
required to get the answer correct (again a reason for a possible code bug).
Simple matching of rDNS will give the wrong result, and I haven't looked at
the SPF code, ever.  With the given SPF record the 'MX' RRs must be fetched
and then mapped to IPs and the results checked (because of aliasing - real in
this case and always possible - i.e. name -> IP is many to one, but IP -> name
is only one to one).

Also, for the list I don't get any SPF_HELO_xxx, for some lists
I do.
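
The double lookup described above can be sketched in Python as follows. This is only an illustration with a static stand-in for DNS built from the dig/host output quoted in this message; it is not SpamAssassin's or any SPF library's code, and the function names are made up. The address of asf.osuosl.org does not appear in the thread, so it is deliberately left empty.

```python
# Static stand-in for DNS, filled in from the records quoted above.
MX_RECORDS = {
    "spamassassin.apache.org": ["asf.osuosl.org", "mail.apache.org"],
}
A_RECORDS = {
    "mail.apache.org": ["209.237.227.199"],
    "asf.osuosl.org": [],  # address not shown in the thread
}
PTR_RECORDS = {
    "209.237.227.199": "hermes.apache.org",
}


def mx_mechanism_matches(domain, client_ip):
    """Fetch the MX hosts of `domain`, map each to its IPs, compare to the client."""
    for mx_host in MX_RECORDS.get(domain, []):
        if client_ip in A_RECORDS.get(mx_host, []):
            return True
    return False


def naive_rdns_matches(domain, client_ip):
    """The shortcut that goes wrong: compare the client's PTR name to the MX names."""
    return PTR_RECORDS.get(client_ip) in MX_RECORDS.get(domain, [])


client = "209.237.227.199"
print(mx_mechanism_matches("spamassassin.apache.org", client))  # True
print(naive_rdns_matches("spamassassin.apache.org", client))    # False
```

The rDNS shortcut fails here because the address reverses to hermes.apache.org, while the MX list names mail.apache.org, which resolves to the same address.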

Paul Shupak
[EMAIL PROTECTED]


Re: First 3.1 observation

2005-08-15 Thread Steve Martin
I replied elsewhere, but I was having some strange DNS problems today  
that probably caused every other lookup to fail.  I THINK that was  
what was causing it.  I'll watch for a while...


--
Steve Martin  http://www.cheezmo.com/
Smart Calibration, LLC   http://www.smartcalibration.com/
The Widescreen Movie Center    http://www.widemovies.com/
Letterboxed Movie TV Schedule  http://www.widemovies.com/lbx.html



Changed maillog entries in 3.1.0-rc1? #2

2005-08-15 Thread Ed Kasky
To add to my last post on this subject, I discovered that the sa-stats.pl
that ships with SA is also coming up with zeroes.  I made the following
change to the script:

 next parseloop unless ($sl->{'program'} eq 'spamd');
to
 next parseloop unless ($sl->{'program'} eq 'spamassassin');

and it works.

Should I report this as a bug or was it an intentional change?

Ed

. . . . . . . . . . . . . . . . . .
Randomly Generated Quote (445 of 997):
Doing gets it done.



Re: Couldn't find a good delta atime

2005-08-15 Thread Loren Wilton
 Hello,

 When I run the sa-learn --force-expire on a regular basis, I eventually
 run into this:

 debug: bayes: couldn't find a good delta atime, need more token
 difference, skipping expire.

 Once it is in this state, I never can recover and have to zap the
database.

 What could cause this?

Which version of SA?  I thought (though I may be mistaken) that Theo put
some patches in to make this not happen, or be less likely, in recent
versions.

Loren



Re: test for multipart/alternative discrepancies?

2005-08-15 Thread Loren Wilton
> Would it be possible to craft a rule that roughly compares the text/plain
> and HTML-stripped text/html versions of a message and scored against them
> if the words they contained were significantly different? Or is that
> technically infeasible?

I just want a rule that checks the text/plain part for zero uris and the
html part for > 0 uris.  That would catch 99+% of this trash without trying
very hard.
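
The check described above (no URIs in text/plain, at least one in text/html) can be sketched with Python's standard email module. This is a rough illustration, not a SpamAssassin rule: the URI regex is a simplification, the function names are made up, and the sample message is hypothetical.

```python
import re
from email import message_from_string

# Simplified URI matcher; real URI extraction is considerably more involved.
URI_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def count_uris_by_type(raw_message):
    """Count URIs separately in the text/plain and text/html parts."""
    counts = {"text/plain": 0, "text/html": 0}
    msg = message_from_string(raw_message)
    for part in msg.walk():
        ctype = part.get_content_type()
        if ctype in counts:
            payload = part.get_payload(decode=True) or b""
            counts[ctype] += len(URI_RE.findall(payload.decode("latin-1")))
    return counts


def html_only_uris(raw_message):
    """True when the plain part has zero URIs and the HTML part has > 0."""
    counts = count_uris_by_type(raw_message)
    return counts["text/plain"] == 0 and counts["text/html"] > 0


# Hypothetical multipart/alternative message for illustration.
SAMPLE = """\
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="XX"

--XX
Content-Type: text/plain

Hello there, nothing to see.
--XX
Content-Type: text/html

<html><body><a href="http://example.com/buy">click</a></body></html>
--XX--
"""

print(count_uris_by_type(SAMPLE))
print(html_only_uris(SAMPLE))
```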

Loren



Re: Changed maillog entries in 3.1.0-rc1? #2

2005-08-15 Thread Loren Wilton
If it doesn't work you should enter a bz ticket, whether it was intentional
or not!  :-)

Loren

 Should I report this as a bug or was it an intentional change?




Re: ANNOUNCE: SpamAssassin 3.1.0-rc1 release candidate available!

2005-08-15 Thread email builder

Excellent.  This is the information I needed!  Is there any chance of getting
an updated release schedule (I checked the wiki, but the schedule info for
3.1.0 seems out of date)?

Might also be nice to see some pointers in the docs about how to reenable the
DCC and Razor plugins for those of us who will continue to use those tools. 
Is having use_dcc and use_razor2 in our local.cf set to one (instead of
relying on the default which has now changed) what you mean by trivial?

Thanks!


>>> - added PostgreSQL, MySQL 4.1+, and local SDBM file Bayes storage
>>>   modules. SQL storage is now recommended for Bayes, instead of
>>>   DB_File. NDBM_File support has been dropped due to a major bug in
>>>   that module.
>>
>> What's the difference between the MySQL support that already existed in
>> prior versions?  Is there anything those of us who already have our
>> bayes data in MySQL should do differently as of 3.1.0?
>
> The previous SQL support (Mail::SpamAssassin::BayesStore::SQL) was very
> generic, usable by multiple database drivers.  With 3.1.0 we broke out
> the support and now include 2 very specific SQL backends (MySQL 4.1+ and
> PostgreSQL) in addition to the more generic backend.  These specific
> backends make use of non-standard SQL features to get a speed boost.
>
> That said, if you were previously using SQL support with a MySQL
> database then you should be able to simply switch to using
> Mail::SpamAssassin::BayesStore::MySQL and get an instant speedup,
> assuming you already have MySQL 4.1+ installed.  We do suggest that you
> switch your tables to InnoDB type tables (not currently the default) to
> get better data integrity (with the support of transactions).
>
> If you were using PostgreSQL with the previous support, you should
> switch (we're talking about a 7x - 27x improvement) ASAP, which might
> involve a complete wipe and rebuild of your database.  Although, I would
> try an sa-learn --backup and sa-learn --restore before I completely gave
> up on the data.
>
> If you are interested in how well the various backends perform, compared
> to the others, see
> http://wiki.apache.org/spamassassin/BayesBenchmarkResults
> It is very hard to compare to previous versions, due to changes in other
> factors such as rules and message parsing code, but the improvements in
> 3.1 represent anywhere from a 2x - 27x improvement over previous
> performance.






bayes expiration problems?

2005-08-15 Thread Steve Martin
I noticed an email took over 300 seconds to process, and the reason
was apparently opportunistic bayes expiry taking too long, as it ended
up aborting processing.

So, I tried sa-learn --force-expire -D and saw this in the output...

[1364] dbg: bayes: token count: 156544, final goal reduction size: 44044
[1364] dbg: bayes: first pass?  current: 1124161947, Last: 1124117250, atime: 2764800, count: 2658, newdelta: 166852, ratio: 16.5703536493604, period: 43200
[1364] dbg: bayes: can't use estimation method for expiry, unexpected result, calculating optimal atime delta (first pass)
[1364] dbg: bayes: expiry max exponent: 9
[1364] dbg: bayes: atime     token reduction
[1364] dbg: bayes: ========  ===============
[1364] dbg: bayes: 43200     150537
[1364] dbg: bayes: 86400     145179
[1364] dbg: bayes: 172800    135718
[1364] dbg: bayes: 345600    125439
[1364] dbg: bayes: 691200    103991
[1364] dbg: bayes: 1382400   63770
[1364] dbg: bayes: 2764800   1714
[1364] dbg: bayes: 5529600   0
[1364] dbg: bayes: 11059200  0
[1364] dbg: bayes: 22118400  0
[1364] dbg: bayes: first pass decided on 2764800 for atime delta
[1364] dbg: locker: refresh_lock: refresh /etc/mail/spamassassin/bayes.lock


... lots of those...

[1364] dbg: bayes: untie-ing
[1364] dbg: bayes: untie-ing db_toks
[1364] dbg: bayes: untie-ing db_seen
[1364] dbg: bayes: files locked, now unlocking lock
[1364] dbg: locker: safe_unlock: unlink /etc/mail/spamassassin/bayes.lock

expired old bayes database entries in 180 seconds
154830 entries kept, 1714 deleted
token frequency: 1-occurence tokens: 62.73%
token frequency: less than 8 occurrences: 20.42%
[1364] dbg: bayes: expiry completed


I ran it again, just to see and got this...


[1381] dbg: bayes: expiry check keep size, 0.75 * max: 112500
[1381] dbg: bayes: token count: 154830, final goal reduction size: 42330
[1381] dbg: bayes: first pass?  current: 1124162215, Last: 1124162127, atime: 2764800, count: 1714, newdelta: 111950, ratio: 24.6966161026838, period: 43200
[1381] dbg: bayes: can't use estimation method for expiry, unexpected result, calculating optimal atime delta (first pass)
[1381] dbg: bayes: expiry max exponent: 9
[1381] dbg: bayes: atime     token reduction
[1381] dbg: bayes: ========  ===============
[1381] dbg: bayes: 43200     148823
[1381] dbg: bayes: 86400     143465
[1381] dbg: bayes: 172800    134004
[1381] dbg: bayes: 345600    123725
[1381] dbg: bayes: 691200    102277
[1381] dbg: bayes: 1382400   62056
[1381] dbg: bayes: 2764800   0
[1381] dbg: bayes: 5529600   0
[1381] dbg: bayes: 11059200  0
[1381] dbg: bayes: 22118400  0
[1381] dbg: bayes: couldn't find a good delta atime, need more token difference, skipping expire
[1381] dbg: bayes: expiry completed
[1381] dbg: bayes: untie-ing
[1381] dbg: bayes: untie-ing db_toks
[1381] dbg: bayes: untie-ing db_seen
[1381] dbg: bayes: files locked, now unlocking lock
[1381] dbg: locker: safe_unlock: unlink /etc/mail/spamassassin/bayes.lock



So, the first time it only got rid of about 2000 tokens and is stuck?

[1364] dbg: bayes: can't use estimation method for expiry, unexpected result, calculating optimal atime delta (first pass)

How can I figure out what went wrong here?

[1381] dbg: bayes: couldn't find a good delta atime, need more token difference, skipping expire

And why did that happen on the second pass?


--
Steve Martin  http://www.cheezmo.com/
Smart Calibration, LLC   http://www.smartcalibration.com/
The Widescreen Movie Center    http://www.widemovies.com/
Letterboxed Movie TV Schedule  http://www.widemovies.com/lbx.html



Re: bayes expiration problems?

2005-08-15 Thread Theo Van Dinter
On Mon, Aug 15, 2005 at 10:23:58PM -0500, Steve Martin wrote:
> 154830 entries kept, 1714 deleted

Ok.

> [1381] dbg: bayes: token count: 154830, final goal reduction size: 42330
> [1381] dbg: bayes: 1382400  62056
>
> So, the first time it only got rid of about 2000 tokens and is stuck?

Yup.

> [1364] dbg: bayes: can't use estimation method for expiry, unexpected
> result, calculating optimal atime delta (first pass)
>
> How can I figure out what went wrong here?

It's in the sa-learn docs.  Basically your last expiry is too different from
what it's trying to do now, so it can't estimate new values based on the old
values.

> and why did that happen on the second pass

Per the above, expiry wants to get rid of 42330 tokens, but the first
(smallest value > 0) atime difference is 62056 tokens, which means too many
would be removed, so it can't expire.
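
The selection described above can be sketched in Python using the numbers from the two debug runs in this thread. This is a simplified model, not SpamAssassin's actual code (which may apply further conditions); the function names are made up, and the maximum database size of 150000 is inferred from the "0.75 * max: 112500" log line.

```python
def goal_reduction(token_count, max_db_size, keep_fraction=0.75):
    """Tokens to remove so that keep_fraction * max_db_size tokens remain."""
    return token_count - int(keep_fraction * max_db_size)


def pick_atime_delta(reductions, goal):
    """Pick the delta that expires the most tokens without exceeding the goal.

    `reductions` maps atime delta (seconds) -> tokens that would be expired.
    Returns None when no delta expires at least one token within the goal.
    """
    best = None
    for delta, removed in reductions.items():
        if 0 < removed <= goal and (best is None or removed > reductions[best]):
            best = delta
    return best


# "atime / token reduction" tables from the two sa-learn runs above.
FIRST_RUN = {43200: 150537, 86400: 145179, 172800: 135718, 345600: 125439,
             691200: 103991, 1382400: 63770, 2764800: 1714,
             5529600: 0, 11059200: 0, 22118400: 0}
SECOND_RUN = {43200: 148823, 86400: 143465, 172800: 134004, 345600: 123725,
              691200: 102277, 1382400: 62056, 2764800: 0,
              5529600: 0, 11059200: 0, 22118400: 0}

goal1 = goal_reduction(156544, 150000)       # 44044, as in the first log
goal2 = goal_reduction(154830, 150000)       # 42330, as in the second log
print(pick_atime_delta(FIRST_RUN, goal1))    # 2764800 (expires 1714 tokens)
print(pick_atime_delta(SECOND_RUN, goal2))   # None: 62056 > 42330, rest are 0
```

In the second run every delta either expires more than the goal or nothing at all, which matches the "couldn't find a good delta atime" message.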

-- 
Randomly Generated Tagline:
You can't run sausage backwards through a meat grinder and end up with
 a whole pig.
 - Tim Peoples talking about the irreversability of UNIX password encoding




Re: test for multipart/alternative discrepancies?

2005-08-15 Thread Theo Van Dinter
On Mon, Aug 15, 2005 at 07:04:36PM -0700, Loren Wilton wrote:
> I just want a rule that checks the text/plain part for zero uris and the
> html part for > 0 uris.  That would catch 99+% of this trash without trying
> very hard.

FWIW, I put in a test rule for this:

OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
  21255    18255     3000    0.859   0.00    0.00  (all messages)
100.000  85.8857  14.1143    0.859   0.00    0.00  (all messages as %)
 21.938  25.5327   0.0667    0.997   0.00    0.01  T_URI_HTML_ONLY

nice. :)
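
For readers unfamiliar with the S/O column, the figures above are consistent with S/O being the spam hit percentage divided by the sum of the spam and ham hit percentages. A small Python check under that assumption (the function name is made up for illustration):

```python
def so_ratio(spam_pct, ham_pct):
    """Share of a rule's hits that are spam, from per-corpus hit percentages."""
    total = spam_pct + ham_pct
    return spam_pct / total if total else 0.0


# Values taken from the table above.
print(round(so_ratio(25.5327, 0.0667), 3))   # T_URI_HTML_ONLY -> 0.997
print(round(so_ratio(85.8857, 14.1143), 3))  # all messages    -> 0.859
```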

-- 
Randomly Generated Tagline:
There are no threads in a.b.p.erotica,  so there's no  gain in using a
 threaded news reader.
 (Unknown source)




Re[2]: Very long scan times - Finding the culprit rule

2005-08-15 Thread Robert Menschel
Hello Paul,

Monday, August 15, 2005, 1:45:53 AM, you wrote:

PJS DNS is working fine.  We've been running SA for 6 months no problem,
PJS it's only when we added the extra 10 rule sets it got bogged down.  I've
PJS just been removing them one by one at the moment and have got the timing
PJS back down to 6 secs or so, but it would be very handy to have the actual
PJS times of each test logged so I can see which are the slow ones.

Don't have times for each rule, but if you'll tell us which rules
files you added, maybe we can tell you which one(s) are most
suspicious.

Bob Menschel





Re: Rules Emporium - what's been incorporated in 3.1.0?

2005-08-15 Thread Robert Menschel
Hello Brian,

Monday, August 15, 2005, 1:46:09 AM, you wrote:

BM As subject, I use a fair number of Rules Emporium rules, is there any
BM information about on which of those rules have made it into the 3.1.0
BM rule set?

As Loren suggested, my mass-check to identify any overlaps between
current SARE rules and 3.1.0 has been running for 2 days, and has
another 3-4 days to go.

When that mass-check completes, I'll review the overlaps, and we'll
modify the SARE rules to avoid any duplications that appear.

Bob Menschel





Re: First 3.1 observation

2005-08-15 Thread hamann . w

Hi,

A well-behaved mailing list sends all mail as Mr. Majordomo or such,
and those should work well.
Less well-behaved ones have the list server send mail as the originating user :(

I installed something on an MTA a while ago which would ask senders from a local
domain to authenticate even for sending to a local domain, and it turned out to
trap Ebay messages.
So at that time Ebay was sending with the envelope from set to the originating
user.

Wolfgang Hamann


 It must be the mailing lists you subscribe to (or some exploder
 or forwarder).  I find most lists, like this one, pass SPF checks.


 Paul Shupak
 [EMAIL PROTECTED]