Re: unwanted breakthrough

2005-08-02 Thread Loren Wilton
 SARE_ADLTSUB2 Subject =~ /\b(?:blow|climax|enlarg(e|ment)|fuck|inter+acial|lick|porn|penis|pervert|pussy|tits|tight|vagina|virgins?)\b/i

 Fix the rule, don't ditch the \b's for such a broad rule..

 Besides, the whole rule is subject to all kinds of obfuscation tricks.
 P.e.n.i.s still won't match, nor any other character-insertion obfuscation.

 I'd suggest creating obfu rules to detect obfuscations, and don't try to
 expand the scope of this already over-broad rule. (which will match a few
 FP cases as-is such as "your photo enlargement is ready")

Um, I was going to point out that this rule is in the _adult set, not the
_obfu set.

Loren



Re: Qmail + spamassassin + squirellmail

2005-08-02 Thread Tom Q. Citizen

Dhanny Kosasih wrote:


Hi,
  Does anybody know how to install qmail + SpamAssassin + SquirrelMail
(so that users can report spam to SpamAssassin)? And how can I make
SpamAssassin autolearn spam?


Regards,
dankos.


Here are two toaster documents I used:

http://sylvestre.ledru.info/howto/howto_qmail_vpopmail.php#vpopmail
http://www.differentpla.net/node/view/165

Good luck!

Peace...

Tom


Runaway processes

2005-08-02 Thread Gordon Ross
I'm running SA 3.0.4 on OpenBSD with Perl 5.8.6 and Exim 4.52.

I'm noticing that SA seems to have a big problem with child processes just 
running away, never terminating and eating CPU.

My mailservers can't cope, and I'm looking at having to switch off SA. (Not 
something I really want to do..)

No matter what I set -m to spamd, they all just go into this endless death 
spiral..

GTG

Gordon Ross,
Network Manager/Rheolwr Rhydwaith
Countryside Council for Wales/Cyngor Cefn Gwlad Cymru



Re: Load balancing spamd

2005-08-02 Thread Jason Frisvold
On 8/1/05, email builder [EMAIL PROTECTED] wrote:
 Even if I had forgotten the -A, I think I would have been seeing connection
 refused notices, but right now, it just seems to time out.  I'm pretty sure
 this is a LVS question more than a spamc/d question, since I've no problems
 with the latter -- I am only asking here to see if anyone else does SA
 weighted load balancing.

I kinda went the other way around..  I have multiple mail machines,
each with their own instance of spamd.  I use a Cisco 7206 VXR to do
the load balancing.  Works like a charm.

 Thanks!


-- 
Jason 'XenoPhage' Frisvold
[EMAIL PROTECTED]


Re: Runaway processes

2005-08-02 Thread Frank M. Cook
I've been fighting a problem which may turn out to be similar.  My
SpamAssassin just starts falling behind, and runaway children could be the
cause.  I'm going to try adjusting the maximum connections per child
(--max-conn-per-child; check the docs for exact syntax).  The default is 200.
Maybe someone else will jump in with a recommended number, but I'm thinking
the default may be way too high.  A lower number will cause each child to
shut down sooner: when the max number is reached, the child is stopped and a
new one is created.


Frank M. Cook
Association Computer Services, Inc.
http://www.acsplus.com



RE: Runaway processes

2005-08-02 Thread Pierre Thomson
 I'm running SA 3.0.4 on OpenBSD with Perl 5.8.6  Exim V4.52.

 I'm noticing that SA seems to have a big problem with child
 processes just running away, never terminating and eating CPU.

 My mailservers can't cope, and I'm looking at having to switch
 off SA. (Not something I really want to do..)

 No matter what I set -m to spamd, they all just go into this
 endless death spiral..

When people ask why I haven't upgraded from 2.64 yet... I'm waiting until a 
week goes by without a new thread about runaway / way-slow / resource-eating SA 
3.0.X processes!  :-)

Pierre




Re: Runaway processes

2005-08-02 Thread nick

Pierre Thomson wrote:

I'm running SA 3.0.4 on OpenBSD with Perl 5.8.6  Exim V4.52.

I'm noticing that SA seems to have a big problem with child
processes just running away, never terminating and eating CPU.

My mailservers can't cope, and I'm looking at having to switch
off SA. (Not something I really want to do..)

No matter what I set -m to spamd, they all just go into this
endless death spiral..



When people ask why I haven't upgraded from 2.64 yet... I'm waiting until a 
week goes by without a new thread about runaway / way-slow / resource-eating SA 
3.0.X processes!  :-)

Pierre



It's good to know I'm not the only one with this issue.


Re: Runaway processes

2005-08-02 Thread Frank M. Cook



so you are running 30 per child and 6 children? 180 total. how
many messages a day are you handling? I upped my children from 5 to 15
thinking that would help but it hasn't. I was thinking of taking
connections down to 5 or 6 on 15 children. maybe I have it
backwards? I don't have anything else running on this computer at all so I
was thinking I wanted to use up all the memory with children. is that off?

Frank M. Cook
Association Computer Services, Inc.
http://www.acsplus.com



Re: Runaway processes

2005-08-02 Thread Mike Jackson
Sorry, no, that didn't come out right. There's only six children running at 
any time. Each will process 30 messages, then restart. The machine processed 
about 3200 messages yesterday, so each child restarted about once every 
2.5-3 hours.


Mike Jackson
Tech Administrator, Datahost
www.datahost.com





RE: Runaway processes

2005-08-02 Thread Herb Martin
 When people ask why I haven't upgraded from 2.64 yet... I'm waiting
 until a week goes by without a new thread about runaway / way-slow /
 resource-eating SA 3.0.X processes!  :-)

I suspect your wait is over: 3.1.0 (due any day now) + 1 week
should make you happy.

Improved thread handling and for me it works even in pre-Release.

--
Herb Martin




Re: runaway processes

2005-08-02 Thread Tom Gwilt

My setup is as follows:

FreeBSD 4.10, SpamAssassin 3.0.4, Perl 5.8

Using Bayes and a pile 'o SARE rules.

It scanned 34484 messages last night and the only time we see lags is when 
the bayes database is expiring.


The startup script is as follows:

/usr/local/bin/spamd --max-children=6 --max-conn-per-child=20 -d -x -u daemon -s local0

HTH,

Tom


Forwarding mail address

2005-08-02 Thread Alexandre Cruz








Hi all,

I do understand that this can sound like a very newbie question, but I have
a doubt that I can't find an answer to. We are using SpamAssassin with
procmail/sendmail. It is working fine; however, spam mail is being forwarded
to a mail account which is no longer valid. I've been looking for where this
address is in the configuration, in order to forward those mails to another
account, but no luck. Any suggestions?

Best regards,

Alexandre Cruz










Re: Forwarding mail address

2005-08-02 Thread Matt Kettler
Alexandre Cruz wrote:
 Hi all,
 
  
 
 I do understand that this can sound as a very newbie question, however i
 have a doubt that i can’t find an answer. We are using Spamassassin with
 procmail/sendmail. It is working fine, however, spam mail is being
 forwarded for a mail account, which is no longer valid. I’ve been
 looking where this address is in the configuration, in order to forward
 those mails to another account, but no luck. Any suggestion?
 
  
 
 Best regards,

Look at your procmailrc.

SpamAssassin itself can't forward mail, so it's not going to be in any of the SA
config files.



Re: Forwarding mail address

2005-08-02 Thread Evan Platt

At 09:00 AM 8/2/2005, you wrote:

Hi all,

I do understand that this can sound as a very newbie question, 
however i have a doubt that i can't find an answer. We are using 
Spamassassin with procmail/sendmail. It is working fine, however, 
spam mail is being forwarded for a mail account, which is no longer 
valid. I've been looking where this address is in the configuration, 
in order to forward those mails to another account, but no luck. Any 
suggestion?



You won't find it.

Spamassassin doesn't forward mail. It scans mail. This is something 
that would need to be done on your mailer, or with a procmail recipe, 
depending on your mail setup. 



Re: Forwarding mail address

2005-08-02 Thread Mike Jackson

I do understand that this can sound as a very newbie question, however i
have a doubt that i can't find an answer. We are using Spamassassin with
procmail/sendmail. It is working fine, however, spam mail is being
forwarded for a mail account, which is no longer valid. I've been
looking where this address is in the configuration, in order to forward
those mails to another account, but no luck. Any suggestion?


Track the mail through your system, checking every step where it could
change system usernames and/or be forwarded to another address:


1. virtusertable
2. aliases file(s)
3. .forward file in user's home directory
4. System-wide procmailrc
5. User-specific .procmailrc

Mike Jackson
Tech Administrator, Datahost
www.datahost.com 
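
The checklist above can be sketched as a small Python search that looks for
the dead address in each candidate file. The paths in CANDIDATES and PER_USER,
the function name find_address, and the example address are assumptions for
illustration; real locations vary between systems, and hashed map files such
as virtusertable.db are not covered by a plain text search.

```python
from pathlib import Path

# Typical system-wide locations on a sendmail/procmail host (assumed paths).
CANDIDATES = [
    "/etc/mail/virtusertable",
    "/etc/aliases",
    "/etc/mail/aliases",
    "/etc/procmailrc",
]

# Per-user files, resolved against the home directory of the current user.
PER_USER = ["~/.forward", "~/.procmailrc"]


def find_address(address, paths):
    """Return (path, line_number, line) for every line mentioning address."""
    matches = []
    needle = address.lower()
    for name in paths:
        path = Path(name).expanduser()
        try:
            text = path.read_text(errors="replace")
        except OSError:
            continue  # missing or unreadable file: nothing to report
        for number, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                matches.append((str(path), number, line.strip()))
    return matches


if __name__ == "__main__":
    # Hypothetical address; substitute the one the spam is being sent to.
    for path, number, line in find_address("old-address@example.com",
                                           CANDIDATES + PER_USER):
        print(f"{path}:{number}: {line}")
```

Run it as the user whose mail is affected (or as root for the system files);
any hit shows the file and line where the forwarding address is configured.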



Re: userpref with mysql does not work

2005-08-02 Thread Michael Parker
Martin Tanzer wrote:

 My setup:
 Debian 3.1 (sarge) with the provided spamassassin package (3.0.3-2)
 Postfix, spamassassin bound to postfix (no amavisd-new)
 There are no users on the machine, all mails are forwarded to another
 mailserver trough the transport file.

 Any ideas?

It seems pretty clear what is happening.  In your test above you did
the right thing, calling spamc with -u <email address you want>, and of
course it worked correctly.  Now, when you are calling it via Postfix
you are no longer sending the correct address to spamc, either by not
using the -u command line param at all or just simply sending spam as
the username.  Fix how you are calling spamc and all will be well.

Michael
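
As a rough sketch of the fix described above, the following Python wrapper
always passes the recipient to spamc with -u, so spamd can look up that
user's preferences. The function names and the idea of a wrapper script are
assumptions for illustration; in a Postfix pipe transport the recipient would
typically be supplied through the ${recipient} macro, and the exact
master.cf wiring depends on the setup.

```python
import subprocess


def spamc_command(recipient):
    """Build a spamc invocation that asks spamd to use recipient's prefs."""
    return ["spamc", "-u", recipient]


def scan(message, recipient):
    """Pipe one raw message (bytes) through spamc and return its output."""
    result = subprocess.run(
        spamc_command(recipient),
        input=message,
        stdout=subprocess.PIPE,
        check=True,  # raise if spamc exits with a non-zero status
    )
    return result.stdout


# Show the command line that would be run for one recipient.
print(spamc_command("user@example.com"))
```

The point is only that the username reaching spamc must be the address whose
SQL preferences should apply, not a fixed system account.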





RE: Runaway processes

2005-08-02 Thread Pierre Thomson
Herb Martin wrote:
 When people ask why I haven't upgraded from 2.64 yet... I'm waiting
 until a week goes by without a new thread about runaway / way-slow /
 resource-eating SA 3.0.X processes!  :-)
 
 
 I suspect your wait is over: 3.1.0 (due any day now) + 1 week
 should make you happy.
 
 Improved thread handling and for me it works even in pre-Release.

That's great news.  I'll give it a try after the initial shakedown.  I must 
add, however, that SA 2.64 with Spamcop URI (SURBL), Bayes, DCC and a dash of 
SARE has been doing a great job here, 98% - 99% of spam caught with minimal 
FP's.  Together with MailScanner and a virus scanner it's handling 15,000 
emails per day on an old 800MHz PIII box, with the load average usually in the 
0.30 range.  And it's rock-solid; I've never needed to kill an SA process in 
over a year of uptime.

Pierre


Re: Runaway processes

2005-08-02 Thread Frank M. Cook



are getting fat, is --max-conn-per-child. It defaults to something
quite large, and setting it down to 20 or so has helped many people.

is it better to run five children with 20 connections each, or 20 children
with five connections each?

Frank M. Cook
Association Computer Services, Inc.
http://www.acsplus.com



Re: Personal Bayes Score

2005-08-02 Thread Dhanny Kosasih

Matthew Yette wrote:


Dankos,

Put this into your /etc/mail/spamassassin/local.cf:

user_scores_sql_custom_query SELECT preference, value FROM _TABLE_
WHERE username = _USERNAME_ OR username = '@GLOBAL' OR username =
_DOMAIN_ ORDER BY username ASC

That will make per-user preferences priority, and then roll back to the
GLOBAL if the user doesn't have a preference specified.

 

If I run spamd with the -u [user] option and use your configuration, the
GLOBAL configuration is never used; is that correct? If not, what is the
correct parameter I must use?


Regards,
dankos.
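
How the quoted query lets per-user rows take priority over '@GLOBAL' can be
sketched with an in-memory SQLite table in Python. This is an illustration
under assumptions: the table name userpref, the sample rows, and the helper
effective_prefs are hypothetical, the _DOMAIN_ clause is left out, and the
sketch assumes the returned rows are applied in order so that a later row
overrides an earlier one, as the quoted explanation implies.

```python
import sqlite3


def effective_prefs(rows):
    """Apply rows in order; a later row overrides an earlier one."""
    prefs = {}
    for preference, value in rows:
        prefs[preference] = value
    return prefs


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE userpref (username TEXT, preference TEXT, value TEXT)")
db.executemany(
    "INSERT INTO userpref VALUES (?, ?, ?)",
    [
        ("@GLOBAL", "required_hits", "5.0"),
        ("@GLOBAL", "use_bayes", "1"),
        ("dankos", "required_hits", "8.0"),
    ],
)

# '@' sorts before letters, so the '@GLOBAL' rows come back first and the
# rows of a user whose name starts with a letter come back last.
QUERY = (
    "SELECT preference, value FROM userpref "
    "WHERE username = ? OR username = '@GLOBAL' ORDER BY username ASC"
)

with_own_prefs = effective_prefs(db.execute(QUERY, ("dankos",)).fetchall())
without_prefs = effective_prefs(db.execute(QUERY, ("someone_else",)).fetchall())
print(with_own_prefs)
print(without_prefs)
```

In this model a user with no rows of their own still receives the '@GLOBAL'
values, and a user with a row only overrides the preferences they set.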






Re: Runaway processes

2005-08-02 Thread Loren Wilton
 so you are running 30 per child and 6 children?  180 total.  how many
messages a day are you handling.  I upped my children from 5 to 15 thinking
that would help but it hasn't.  I was thinking of taken connections down to
5 or 6 on 15 children.  maybe I have it backwards?  I don't have anything
else running on this computer at all so I was thinking I wanted to use up
all the memory with children. is that off?

30 connections on 6 children is a reasonable number for many smaller sites,
the type that average probably less than 10K mails/day, at a guess.  It
should work reasonably well on the typical system with at least 500MB of
memory and a 500MHz or faster processor.

With a slower processor, or certainly with less memory, you might want to
take the number of children down, and possibly the number of connections.

Simple description on how this stuff works:

spamd fires off some number of children determined by -m, with the default
of 5.

Each child takes some amount of memory.  This is typically 30-60MB *per
child* depending on the number of rules files you have.  It will start a bit
smaller than that, and will typically grow over the first dozen or so mails.

If you have a lot of rules so your spamd children are taking 60MB each, 5 *
60 = 300MB.  You better have a 512MB system or larger or you will be in heap
big trouble.  Even at 30MB, 30 * 5 = 150MB.  This would probably work in a
256M system, but maybe not.  You might want -m 3 or so in this case.

Each child will process --max-conn-per-child messages before it dies and a
new child is created in its place.  If all mail was pretty much the same,
and if the children did nothing but process mail, this really shouldn't
matter.

But the real fact is that all mail isn't the same.  Some are very large.
They should be limited to 250K or so, but some programs like qmail don't
necessarily limit the mail size in the standard configuration.

It is NOT a direct relation from mail size to spamd child size!  A 250KB
mail might easily crank a child up to 250MB!

Once the child gets big, it just stays that way.  If you feed large mails to
SA, you can get some really fat children.  5 children at 250MB each aren't
going to fit well in a 512MB system.

If you only let each child process a few messages before dying, if it
happens to process one large message and gets big, it will only stay big for
a few messages before going away.  Chances are relatively small that all the
children will manage to get fat at the same time, so you will probably
survive just fine.  With a large value of max con per child (like the
default) it is pretty easy to get all the children fat at once.

Spamd children also do other things than just process mail.  Like doing
database expiration runs.  These tend to get the children very fat,
especially if you have a database that has somehow gotten out of control.
Again, this causes Bad Things(tm) if it happens to a lot of the children at
once.

Loren
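
The memory arithmetic in this message can be written down as a short Python
sketch. The function names and the reserve_mb headroom parameter are
assumptions added for illustration; the per-child sizes (30, 60 and 250 MB)
and the child counts are the figures quoted above.

```python
def spamd_memory_mb(children, mb_per_child):
    """Worst-case memory if every child grows to mb_per_child."""
    return children * mb_per_child


def max_children(ram_mb, mb_per_child, reserve_mb=0):
    """Largest -m value whose children fit in RAM after reserving headroom."""
    return max((ram_mb - reserve_mb) // mb_per_child, 0)


# Default of 5 children with a large rule set (60 MB each).
print(spamd_memory_mb(5, 60))

# Default of 5 children with a small rule set (30 MB each).
print(spamd_memory_mb(5, 30))

# All 5 children bloated to 250 MB by large mails: far beyond a 512 MB box.
print(spamd_memory_mb(5, 250))

# A 256 MB machine with 30 MB children, keeping half the RAM for the
# rest of the system (the reserve is a guess, not a recommendation).
print(max_children(256, 30, reserve_mb=128))
```

A low --max-conn-per-child does not change these totals; it only shortens how
long a child that has grown large stays in memory before being replaced.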



Re: Increase Performance howto

2005-08-02 Thread Loren Wilton
 I tested my qmail wtih more than 14000 spam (i used qmail-inject in my
 script). If i use QSheff + ClamAV + SpamAssassin, my server process
 14000 emails in 1 hour, and if i only use qmail my server process 14000
 emails in 1/3 hours. How can i increase my server performance ? I don't
 understand what 'max-connection' and '-m' for, can u tell me what is that
?

I just did a long reply on this subject; look at the thread 'Runaway
processes'.

Loren



Re: Runaway processes

2005-08-02 Thread Loren Wilton
 is it better to run five children with 20 connections each, or 20 children
with five connections each?

Pretty much answered in my following mail.  In general each child might use
30-60mb under NORMAL circumstances, so the amount of memory on your machine
will determine the upper limit for number of children.

In most cases you shouldn't really need less than about 20 connections
(mails processed before dying) per child.  If you do it may be a sign of
other configuration problems in the system, such as not limiting the size of
large mails going through SA.

Loren

PS: Could you post plain text rather than html if convenient?  OE makes
quoting from HTML a bloody pain.  :-(



Re: Forwarding mail address

2005-08-02 Thread jdow
From: Alexandre Cruz [EMAIL PROTECTED]

 Hi all,
  
 I do understand that this can sound as a very newbie question, however i
 have a doubt that i can't find an answer. We are using Spamassassin with
 procmail/sendmail. It is working fine, however, spam mail is being
 forwarded for a mail account, which is no longer valid. I've been
 looking where this address is in the configuration, in order to forward
 those mails to another account, but no luck. Any suggestion?

Is fetchmail involved? If so then you might have to change contents
of either /etc/fetchmailrc or that person's account's .fetchmailrc
file.

For the /etc/fetchmailrc case all you need to do is redirect that
person's email by changing the local address stanza. If a fetchmail
is running for each account then you would need to disable that
account's fetchmail startup, where ever that happens. Then add
lines to some other account's .fetchmailrc to poll for and
receive the mail instead.

If you are not using fetchmail you need to punt a little. The sendmail
(or substitute) files might need an alias on that account. Others can
suggest tactics for this case.

{^_^}



Re: Increase Performance howto

2005-08-02 Thread jdow
From: Dhanny Kosasih [EMAIL PROTECTED]

 I tested my qmail wtih more than 14000 spam (i used qmail-inject in my
 script). If i use QSheff + ClamAV + SpamAssassin, my server process
 14000 emails in 1 hour, and if i only use qmail my server process 14000
 emails in 1/3 hours. How can i increase my server performance ? I don't
 understand what 'max-connection' and '-m' for, can u tell me what is that
?

If you do not already have a large amount of memory then adding
memory is one of the sovereign cures for slow SpamAssassin. As soon
as it goes to swap you're dead.

More processor also helps.

Fewer rule sets lead to faster operation, at the cost of poorer filtering.

You are already processing much faster than my 1GHz Athlon which has
a gigabyte of ram. With all the rules I run it takes on the order of
a second and a half to scan for single messages. With multiple messages
at once there is some net advantage to the multiprocessing that happens.

It may be time to split the server into two machines.
{^_^}
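
For comparison, the throughput figures in this exchange work out as follows.
This is plain arithmetic on the numbers quoted above (14000 messages in 1 hour
with filtering, 14000 in 1/3 hour without, and about 1.5 seconds per message
on the Athlon); the function and variable names are made up for the sketch.

```python
def messages_per_second(messages, hours):
    """Average throughput over the whole run."""
    return messages / (hours * 3600)


with_filters = messages_per_second(14000, 1.0)   # QSheff + ClamAV + SA
qmail_only = messages_per_second(14000, 1 / 3)   # qmail alone
one_at_a_time = 1 / 1.5                          # 1.5 s per message, serial

print(round(with_filters, 2))
print(round(qmail_only, 2))
print(round(one_at_a_time, 2))
```

The filtered setup already averages several messages per second, which is
well above what a single scanner handling one message every 1.5 seconds could
do without running scans in parallel.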




Re: Load balancing spamd

2005-08-02 Thread email builder


--- Jason Frisvold [EMAIL PROTECTED] wrote:

 On 8/1/05, email builder [EMAIL PROTECTED] wrote:
  Even if I had forgotten the -A, I think I would have been seeing
 connection
  refused notices, but right now, it just seems to time out.  I'm pretty
 sure
  this is a LVS question more than a spamc/d question, since I've no
 problems
  with the latter -- I am only asking here to see if anyone else does SA
  weighted load balancing.
 
 I kinda went the other way around..  I have multiple mail machines,
 each with their own instance of spamd.  I use a Cisco 7206 VXR to do
 the load balancing.  Works like a charm.

Wow, a bit out of our price range here.  :)  

We have also considered just continuing to build out MTA boxes each with an
Amavis/Clamd and SA on them to share our increasing load (just use LVS to
balance the incoming SMTP traffic and there is little reason to worry about
balancing SA or Amavis/Clam), but our first choice is to split the layers
-- have a couple separate machines that just do MTA-ish things, and a
separate set of boxes that serve as a SA (and Clam-av) farm.  The thing
that's better about doing it that way is the redundancy that you don't get if
you aren't sharing spamd instances across all your MTA machines.  

Technically, this should be feasible with just plain DNS load balancing, but
in our current medium/low budget scenario, we don't have the rackspace to
have numerous boxes that are dedicated ONLY to SA/clam, thus our desire is to
figure out a way to *WEIGHT* our spamd balancing.

I'm surprised there's not a lot of folks out there who have done this
before?

Thanks again!





 


RE: Load balancing spamd

2005-08-02 Thread Gary W. Smith
We have 4 front end servers running postfix.  These servers call an AV
process on two additional AV servers behind the wall.  Then these
servers call spamd on two additional servers behind the wall.  Those two
servers have a simple MySQL cluster (running Linux-HA and DRBD).

In all we have 8 boxes that handle all of our email for our clients.  We
are receiving about 170k emails per day coming into the network.  We
recently upgraded all of the hardware to Dell Dimension 4700s with 1.5 GB
of RAM each.  Budget was $5200.

Machines are idle.  

Something new we have been looking at as well.  We are looking at
setting up simple relays that will run RBL on the front end and then
just hand them off to our 4 backend servers.  But since it works right
now we're not going to fix it.



Re: Load balancing spamd

2005-08-02 Thread email builder


--- Charles Sprickman [EMAIL PROTECTED] wrote:

 On Tue, 2 Aug 2005, email builder wrote:
 
  Technically, this should be feasible with just plain DNS load balancing,
 but
  in our current medium/low budget scenario, we don't have the rackspace to
  have numerous boxes that are dedicated ONLY to SA/clam, thus our desire
 is to
  figure out a way to *WEIGHT* our spamd balancing.
 
 I've been very happy with DNS load balancing.  The frontend mxer runs 
 tinydns on a local zone blah.local.domain.com, and an instance of 
 dnscache with the round-robin patch is pointed to in resolv.conf.  While I 
 thought that the load balancing would be a little rough, looking at the 
 stats I sent 17011 messages through #1, 17025 through #2, and 17016 
 through #3 yesterday.  I can also weight this by having multiple records, 
 ie:
 
 spamd1 gets three identical entries in tinydns
 spamd2 gets three identical entries in tinydns
 spamd3 gets three identical entries in tinydns
 spamd4 gets one entry

Ooh, some good bits!  We have always been plenty satisfied with BIND, but
maybe this is the straw that broke the camel's back... unless anyone knows
whether BIND will behave the same way if we have multiple entries for one
host?

 
 that will leave spamd4 seeing about 1/3 the load of the other boxes.  It's 
 not clustering, but when using the -d flag:
 
 -d host
     Connect to spamd server on given host.  If host resolves to multiple
     addresses, then spamc will fail-over to the other addresses, if
     the first one cannot be connected to.
 
 it should hit another box if one goes down.  Or some easy scripting could 
 remove the appropriate entries from tinydns if one machine stops 
 responding.
 
 Speaking of low budget, we have three SA boxes, each of which has a 2GHz 
 AMD processor, 1GB RAM.  The first two cost about $550, the last one about 
 $425.  They are pretty crappy boxes with no RAID, etc., but it's cheaper 
 for me to keep one more box than needed in the equation than to build out 
 a few uber spamd boxes.  They are in mini-atx cases, so they barely take 
 up more room than an equivalent number of 1U boxes. I spawn 30 spamd 
 children on each.  I have been very happy with the performance so far.
 
  I'm surprised there's not a lot of folks out there who have done this
  before?
 
 Maybe they're all cheap like me. :)

Awesome!  Thanks for the advice!!!
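
The weighting trick quoted above (three records each for spamd1 to spamd3,
one for spamd4) can be checked with a short Python sketch. It assumes the
resolver hands out each record with equal probability, which is what the
round-robin setup in the quoted message is meant to approximate; the function
name expected_shares is made up for the example.

```python
from collections import Counter
from fractions import Fraction


def expected_shares(records):
    """Share of connections per host when every record is equally likely."""
    counts = Counter(records)
    total = sum(counts.values())
    return {host: Fraction(n, total) for host, n in counts.items()}


# Record list from the quoted message: 3 + 3 + 3 + 1 entries.
records = ["spamd1"] * 3 + ["spamd2"] * 3 + ["spamd3"] * 3 + ["spamd4"]
shares = expected_shares(records)

for host in sorted(shares):
    print(host, shares[host])

# spamd4 relative to one of the fully weighted boxes.
print(shares["spamd4"] / shares["spamd1"])
```

Under that assumption spamd4 receives one tenth of all connections, which is
one third of the load seen by each of the other three boxes.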



Bayes: not enough usable tokens found

2005-08-02 Thread Mike Cavanagh

What does this message mean??
   debug: cannot use bayes on this message; not enough usable tokens found
   debug: bayes: not scoring message, returning undef

I am using MimeDefang Ver. 2.52 and SpamAssassin Ver. 3.0.4

Below is:
   current status of bayes database (sa-learn --dump=magic)
   sa-mimedefang.cf
   spamassassin --lint --debug

What am I doing wrong?  I am sure this is something simple, I just can't 
seem to see it.

Thanks,
Mike.

*
SA-LEARN Status:
/usr/local/bin/sa-learn --username=mimedefang --siteconfigpath=/etc/mail/spamassassin --dump=magic

0.000          0          3          0  non-token data: bayes db version
0.000          0       4275          0  non-token data: nspam
0.000          0        765          0  non-token data: nham
0.000          0     148928          0  non-token data: ntokens
0.000          0 1120235107          0  non-token data: oldest atime
0.000          0 1123040192          0  non-token data: newest atime
0.000          0 1123030366          0  non-token data: last journal sync atime
0.000          0 1123000571          0  non-token data: last expiry atime
0.000          0    2764800          0  non-token data: last expire atime delta
0.000          0       2580          0  non-token data: last expire reduction count


*
Sa-mimedefang.cf:
required_hits   10
ok_locales  en, zh
skip_rbl_checks 0   # Go ahead and check anyways
use_bayes 1
bayes_auto_learn 1
bayes_auto_learn_threshold_nonspam 0.1
bayes_auto_learn_threshold_spam 12.0
bayes_learn_during_report 1
bayes_path /etc/mail/spamassassin/bayes
bayes_file_mode 0700
bayes_min_ham_num 200
bayes_min_spam_num 200
bayes_use_hapaxes 1
bayes_use_chi2_combining 1
bayes_auto_expire 1
bayes_learn_to_journal 0
bayes_journal_max_size 102400
use_dcc 1
use_pyzor 1
use_razor2 1

*
Spamassassin Lint:
spamassassin -D --lint --siteconfigpath=/etc/mail/spamassassin
debug: SpamAssassin version 3.0.4
debug: Score set 0 chosen.
debug: running in taint mode? yes
debug: Running in taint mode, removing unsafe env vars, and resetting PATH
debug: PATH included '/usr/sbin', keeping.
debug: PATH included '/usr/bin', keeping.
debug: PATH included '/usr/ccs/bin', keeping.
debug: PATH included '/usr/local/bin', keeping.
debug: PATH included '/opt/sfw/bin', keeping.
debug: Final PATH set to: /usr/sbin:/usr/bin:/usr/ccs/bin:/usr/local/bin:/opt/sfw/bin


debug: diag: module not installed: DBI ('require' failed)

debug: diag: module installed: DB_File, version 1.811
debug: diag: module installed: Digest::SHA1, version 2.07
debug: diag: module installed: IO::Socket::UNIX, version 1.21
debug: diag: module installed: MIME::Base64, version 3.03
debug: diag: module installed: Net::DNS, version 0.46

debug: diag: module not installed: Net::LDAP ('require' failed)

debug: diag: module installed: Razor2::Client::Agent, version 2.40
debug: diag: module installed: Storable, version 2.09
debug: diag: module installed: URI, version 1.30
debug: ignore: using a test message to lint rules
debug: using /opt/sfw/share/spamassassin for default rules dir
debug: config: read file /opt/sfw/share/spamassassin/10_misc.cf
debug: config: read file /opt/sfw/share/spamassassin/20_anti_ratware.cf
debug: config: read file /opt/sfw/share/spamassassin/20_body_tests.cf
debug: config: read file /opt/sfw/share/spamassassin/20_compensate.cf
debug: config: read file /opt/sfw/share/spamassassin/20_dnsbl_tests.cf
debug: config: read file /opt/sfw/share/spamassassin/20_drugs.cf
debug: config: read file /opt/sfw/share/spamassassin/20_fake_helo_tests.cf
debug: config: read file /opt/sfw/share/spamassassin/20_head_tests.cf
debug: config: read file /opt/sfw/share/spamassassin/20_html_tests.cf
debug: config: read file /opt/sfw/share/spamassassin/20_meta_tests.cf
debug: config: read file /opt/sfw/share/spamassassin/20_phrases.cf
debug: config: read file /opt/sfw/share/spamassassin/20_porn.cf
debug: config: read file /opt/sfw/share/spamassassin/20_ratware.cf
debug: config: read file /opt/sfw/share/spamassassin/20_uri_tests.cf
debug: config: read file /opt/sfw/share/spamassassin/23_bayes.cf
debug: config: read file /opt/sfw/share/spamassassin/25_body_tests_es.cf
debug: config: read file /opt/sfw/share/spamassassin/25_hashcash.cf
debug: config: read file /opt/sfw/share/spamassassin/25_spf.cf
debug: config: read file /opt/sfw/share/spamassassin/25_uribl.cf
debug: config: read file /opt/sfw/share/spamassassin/30_text_de.cf
debug: config: read file /opt/sfw/share/spamassassin/30_text_fr.cf
debug: config: read file /opt/sfw/share/spamassassin/30_text_nl.cf
debug: config: read file /opt/sfw/share/spamassassin/30_text_pl.cf
debug: config: read file /opt/sfw/share/spamassassin/50_scores.cf
debug: config: read file /opt/sfw/share/spamassassin/60_whitelist.cf
debug: using /etc/mail/spamassassin 

RE: Load balancing spamd

2005-08-02 Thread email builder


--- Gary W. Smith [EMAIL PROTECTED] wrote:

 We have 4 front end servers running postfix.  These servers call and AV
 process on two additional AV servers behind the wall.  Then these
 servers

"These" being the AV servers?  Do they call spamd, or does it go back to the
MTA first?

How do you (make and) balance the calls to the AV servers?  How do you (make
and) balance the calls to the spamd machines?  I am very interested in these
details!

 call spamd on two additional servers behind the wall.  Those two
 servers have a simple MySQL cluster (running Linux-HA and DRBD).  
 
 In all we have 8 boxes that handle all of our email for our clients.  We
 are generating about 170k emails per day coming into the network.

We are edging up to 95K a day now on only two machines.  You can imagine we
are anxious to start using the other boxes we have rarin' to go!

 We
 recently upgrade all of the hardware to Dell Dimension 4700's with 1.5gb
 ram each.  Budget was $5200.  
 
 Machines are idle.  

Sweet.  ;)
 
 Something new we have been looking at as well.  We are looking at
 setting up simple relays that will run RBL on the front end and then
 just hand them off to our 4 backend servers.  But since it works right
 now we're not going to fix it.

Why?  Because the DNS cost of querying your RBLs from Postfix is very heavy
and slowing you down?  Are you going to mirror just one chosen RBL out
there or a combination of several?

Do you run DCC in your SA environment?  If so, you are over the volume at
which they recommend hosting your own DCC server (we are nearing it - 100K a
day I think).  Do you run a DCC server for yourself?  Any issues to be aware
of?

Thanks a TON!!


 
  -Original Message-
  From: email builder [mailto:[EMAIL PROTECTED]
  Sent: Tuesday, August 02, 2005 5:19 PM
  To: Jason Frisvold
  Cc: Gary W. Smith; users@spamassassin.apache.org
  Subject: Re: Load balancing spamd
  
  
  
  --- Jason Frisvold [EMAIL PROTECTED] wrote:
  
   On 8/1/05, email builder [EMAIL PROTECTED] wrote:
    Even if I had forgotten the -A, I think I would have been seeing
    connection refused notices, but right now, it just seems to time out.
    I'm pretty sure this is a LVS question more than a spamc/d question,
    since I've no problems with the latter -- I am only asking here to see
    if anyone else does SA weighted load balancing.
  
   I kinda went the other way around..  I have multiple mail machines,
   each with their own instance of spamd.  I use a Cisco 7206 VXR to do
   the load balancing.  Works like a charm.
  
  Wow, a bit out of our price range here.  :)
  
  We have also considered just continuing to build out MTA boxes each with
  an Amavis/Clamd and SA on them to share our increasing load (just use LVS
  to balance the incoming SMTP traffic and there is little reason to worry
  about balancing SA or Amavis/Clam), but our first choice is to split the
  layers -- have a couple separate machines that just do MTA-ish things, and
  a separate set of boxes that serve as a SA (and Clam-av) farm.  The thing
  that's better about doing it that way is the redundancy that you don't get
  if you aren't sharing spamd instances across all your MTA machines.
  
  Technically, this should be feasible with just plain DNS load balancing,
  but in our current medium/low budget scenario, we don't have the rackspace
  to have numerous boxes that are dedicated ONLY to SA/clam, thus our desire
  is to figure out a way to *WEIGHT* our spamd balancing.
  
  I'm surprised there's not a lot of folks out there who have done this
  before?
  
  Thanks again!
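
As an illustration of the weighting idea above, here is a minimal sketch of
a weighted round-robin scheduler.  It is not how LVS or spamc choose a
backend; the host names and weights are placeholders, and the function name
is made up for this example.  A box that also does MTA work would simply be
given a smaller weight than a box dedicated to spamd.

```python
def weighted_round_robin(backends, n):
    """Return n backend picks, spread in proportion to integer weights.

    backends maps a host name to a positive integer weight.  This is the
    "smooth" variant: each round every host gains its weight, the host with
    the highest running total is picked, and the total weight is subtracted
    from that host so the others can catch up.
    """
    current = {host: 0 for host in backends}
    total = sum(backends.values())
    order = []
    for _ in range(n):
        for host, weight in backends.items():
            current[host] += weight
        best = max(current, key=current.get)
        current[best] -= total
        order.append(best)
    return order


if __name__ == "__main__":
    # Hypothetical farm: one dedicated spamd box, one shared with the MTA.
    farm = {"spamd-big.example.com": 3, "spamd-small.example.com": 1}
    for host in weighted_round_robin(farm, 8):
        print(host)
```

Each picked host could then be handed to spamc through its -d option, so
the weighting lives in whatever wrapper calls spamc rather than in DNS.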
  
  
  
  
  
  
 






Re: Bayes: not enough usable tokens found

2005-08-02 Thread Loren Wilton
 What does this message mean??
 debug: cannot use bayes on this message; not enough usable tokens found
 debug: bayes: not scoring message, returning undef

Unless you are seeing this a whole lot, I don't think you are doing anything
wrong.  I think this just means that the particular mail didn't much match
anything Bayes had seen before, so it didn't feel competent to assign a
score to it.  I would have expected that to be a bayes_50 case, but it looks
like it just decided to bypass the message.
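
To make the "didn't feel competent" idea concrete, here is a rough sketch of
a naive Bayes combiner that declines to score when too few of a message's
tokens are known.  This is not SpamAssassin's actual code; the threshold
MIN_USABLE_TOKENS, the function name, and the sample probabilities are all
assumptions made up for illustration.

```python
import math

# Hypothetical cutoff; the real classifier has its own rules.
MIN_USABLE_TOKENS = 3


def bayes_probability(tokens, token_probs, min_tokens=MIN_USABLE_TOKENS):
    """Combine per-token spam probabilities, or return None if too few match.

    token_probs maps a token to P(spam | token), strictly between 0 and 1.
    Tokens the database has never seen are ignored.  Returning None mirrors
    the "not scoring message, returning undef" behaviour in the debug log.
    """
    known = [token_probs[t] for t in set(tokens) if t in token_probs]
    if len(known) < min_tokens:
        return None
    # Work in log space: P = prod(p) / (prod(p) + prod(1 - p)).
    log_spam = sum(math.log(p) for p in known)
    log_ham = sum(math.log(1.0 - p) for p in known)
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))


if __name__ == "__main__":
    probs = {"viagra": 0.9, "free": 0.8, "meeting": 0.1}
    print(bayes_probability(["hello", "world"], probs))
    print(bayes_probability(["viagra", "free", "meeting"], probs))
```

In this toy model a message made up entirely of unseen words gets no Bayes
result at all, rather than a neutral 50% score.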

Loren



Re: Bayes: not enough usable tokens found

2005-08-02 Thread Mike Cavanagh




Hum. I can see some messages are being caught via the Bayes test, but
I would think Bayes would find many more as I have close to 5000 SPAM
in the Bayes system.
I get at most 15 messages a day flagged as SPAM while I receive approx.
100 messages a day as non-SPAM but should be flagged as SPAM.

I have started to include the Spamassassin footer on all messages to
get a handle on what passes in the "non-Spam" messages.

Any thoughts on how to improve this would be helpful.

Thanks,
Mike


Loren Wilton wrote:

  
 What does this message mean??
 debug: cannot use bayes on this message; not enough usable tokens found
 debug: bayes: not scoring message, returning undef

  
  
Unless you are seeing this a whole lot, I don't think you are doing anything
wrong.  I think this just means that the particular mail didn't much match
anything Bayes had seen before, so it didn't feel competent to assign a
score to it.  I would have expected that to be a bayes_50 case, but it looks
like it just decided to bypass the message.

Loren

  



Spam detection software, running on the system fred.5cs.com, has
identified this incoming email as possible spam.  The original message
has been attached to this so you can view it (if it isn't spam) or label
similar future email.  If you have any questions, see
[EMAIL PROTECTED] for details.

Content preview:  Hum. I can see some messages are being caught via the
  Bayes test, but I would think Bayes would find many more as I have
  close to 5000 SPAM in the Bayes system. I get at most 15 messages a day
  flagged as SPAM while I receive approx. 100 messages a day as non-SPAM
  but should be flagged as SPAM. [...] 

Content analysis details:   (-5.9 points, 10.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
-3.3 ALL_TRUSTED            Did not pass through any untrusted hosts
 0.0 HTML_30_40             BODY: Message is 30% to 40% HTML
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.0 HTML_TITLE_EMPTY       BODY: HTML title contains no text
-2.6 BAYES_00               BODY: Bayesian spam probability is 0 to 1%
                            [score: 0.]




OT: RBL for dynamic no reverse DNS lookups

2005-08-02 Thread Rob McEwen

I'm trying to find an RBL which will return a standard RBL return code (like
127.0.0.2) if/when the IP passed to the RBL doesn't have a reverse DNS
entry.

(1) I know that SA doesn't have a need for this as another function is
already available in SA for this. But I need this for a **different**
utility, not SA (which is why I said, OT).

(2) This other utility doesn't have the option to check for no reverse
DNS, but CAN do whatever general RBL lookups I tell it to do. Also, I don't
have access to this utility's source code.  However, if I can find this kind
of RBL I mentioned, then I can use this utility's RBL lookups against that
kind of RBL to accomplish checking a message's sending server for no
reverse DNS. But, again, doing lookups on (reversedIP).in-addr.arpa is NOT
an option in this utility because it **only** works with the traditional RBL
responses, which are always numeric, unlike reverse DNS lookups.

(3) I know that some aggressive RBLs factor in no reverse DNS... but,
instead, I'm looking for an RBL which would do a DYNAMIC lookup to see if
there is no reverse DNS, even if that RBL hasn't checked that IP before or
hasn't previously added that IP to its "no reverse DNS" nameserver
database.

(4) And, of course, I understand that it is NOT a good idea to block
**solely** due to a sending server's IP not having a reverse DNS lookup.
Rather, I'm using this for auditing, testing, and other things.
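
For readers unfamiliar with the two lookup styles contrasted above, here is
a small sketch.  It does not point at any real RBL that offers the dynamic
behaviour being asked for; the zone rbl.example.org is a placeholder, and
the function names are made up for illustration.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the name an RBL client looks up: reversed octets plus the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone):
    """Traditional RBL check: an A record in 127.0.0.0/8 means "listed"."""
    try:
        answer = socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False  # no such name: not listed
    return answer.startswith("127.")


def has_reverse_dns(ip):
    """Reverse DNS check: a PTR lookup, which returns a name, not an IP."""
    try:
        socket.gethostbyaddr(ip)
        return True
    except (socket.herror, socket.gaierror):
        return False


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address; the zone is hypothetical.
    print(dnsbl_query_name("192.0.2.10", "rbl.example.org"))
    print(ipaddress.ip_address("192.0.2.10").reverse_pointer)
```

The two query names look alike, but the RBL answer is an address such as
127.0.0.2 while the in-addr.arpa answer is a host name, which is why a
utility that only understands numeric RBL replies cannot use the latter.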

Thanks,

Rob McEwen
PowerView Systems



Re: Bayes: not enough usable tokens found

2005-08-02 Thread Daryl C. W. O'Shea

Mike Cavanagh wrote:
Hum.  I can see some messages are being caught via the Bayes test, but I 
would think Bayes would find many more as I have close to 5000 SPAM in 
the Bayes system.
I get at most 15 messages a day flagged as SPAM while I receive approx. 
100 messages a day as non-SPAM but should be flagged as SPAM.


I have started to include the Spamassassin footer on all messages to get 
a handle on what passes in the non-Spam messages.


Any thoughts on how to improve this would be helpful.





 pts rule name              description
---- ---------------------- --------------------------------------------------
-3.3 ALL_TRUSTED            Did not pass through any untrusted hosts



http://wiki.apache.org/spamassassin/TrustPath



Re: Bayes: not enough usable tokens found

2005-08-02 Thread Loren Wilton
Hum.  I'm a little confused by that SA score stuff on the bottom of the
message.  If it refers to a message that should be spam you have two serious
problems.  If it referred to a message from this list you may have a serious
problem and a less serious problem.

 pts rule name              description
---- ---------------------- --------------------------------------------------
-3.3 ALL_TRUSTED            Did not pass through any untrusted hosts
 0.0 HTML_30_40             BODY: Message is 30% to 40% HTML
 0.0 HTML_MESSAGE           BODY: HTML included in message
 0.0 HTML_TITLE_EMPTY       BODY: HTML title contains no text
-2.6 BAYES_00               BODY: Bayesian spam probability is 0 to 1%

In general ALL_TRUSTED shouldn't be firing for messages coming from an
external source.  This makes me wonder if you have trusted_networks and
internal_networks set correctly.
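
As a simplified model of why a wrong trust setting makes ALL_TRUSTED fire on
outside mail, consider the sketch below.  It is not SpamAssassin's
implementation; the function name, the relay addresses, and the network
ranges are assumptions chosen for illustration.

```python
import ipaddress


def all_trusted(relay_ips, trusted_networks):
    """Return True when every relay address falls inside a trusted network."""
    nets = [ipaddress.ip_network(n) for n in trusted_networks]
    return all(
        any(ipaddress.ip_address(ip) in net for net in nets)
        for ip in relay_ips
    )


if __name__ == "__main__":
    # Hypothetical path: one outside sender, then the local mail server.
    relays = ["203.0.113.7", "192.168.1.5"]
    print(all_trusted(relays, ["192.168.1.0/24"]))  # sensible setting
    print(all_trusted(relays, ["0.0.0.0/0"]))       # far too broad
```

With the narrow range the outside hop is untrusted and the check fails, as
it should.  With the over-broad range every hop counts as trusted, which is
the kind of misconfiguration that would hand spam a -3.3 bonus.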

In general SA (and especially Bayes) shouldn't be seeing this list, since it
has a lot of real spam floating through it, and other spammy tokens.  It is
far better to use postfix or whatever your router is to bypass this list
around SA.

If that header referred to a spam, BAYES_00 says that Bayes thought it was
guaranteed ham.  That would be a sign that you have a corrupted bayes
database.

Loren