R: AWL and whitelists

2006-10-27 Thread Giampaolo Tomassoni
 Hi all!
 I don't understand something about how AWL works and would like somebody to
 clear it up for me.
 I know that AWL is a score-averaging system and that it's a bad idea to use it
 as a whitelist, but the --add-to-whitelist (-W) option adds an e-mail address
 to the AWL with a score of -100. This option behaves very strangely.
 I use the SQL backend and can watch the latest changes.
 
 My actions:
 1. Add the e-mail address to the AWL with a score of -100:
 spamassassin -W test-email
 A new row appears in awl: email|none|1|-100 (e-mail|ip|count|score)
 
 2. I do first test check:
 cat test-email | spamc -R
 Content analysis details:   (-49.4 points, 5.0 required)
  pts rule name              description
 ---- ---------------------- --------------------------------------------------
  0.6 HTML_SHORT_LENGTH      BODY: HTML is extremely short
  0.0 HTML_MESSAGE           BODY: HTML included in message
  0.5 DNS_FROM_RFC_ABUSE     RBL: Envelope sender in abuse.rfc-ignorant.org
  -51 AWL                    AWL: From: address is in the auto white-list
 In awl row: email|62.234|1|1.109 (e-mail|ip|count|score)
 Something strange: count is 1 still and score has become 1.109 :( 

What happened to the email|none row? Is it still there?


 3. I do second check 
 cat test-email | spamc -R
 Content analysis details:   (1.1 points, 5.0 required)
  pts rule name              description
 ---- ---------------------- --------------------------------------------------
  0.6 HTML_SHORT_LENGTH      BODY: HTML is extremely short
  0.0 HTML_MESSAGE           BODY: HTML included in message
  0.5 DNS_FROM_RFC_ABUSE     RBL: Envelope sender in abuse.rfc-ignorant.org
 In awl row: email|62.234|2|2.218 (e-mail|ip|count|score)
 Where is AWL check in the second check report? 

Again, see if an email|none row is still there.

Anyway, to whitelist a message source, use whitelist_from or, even better, 
whitelist_from_rcvd in your .cf file.

If you use AWL to do this, your whitelist score may get consumed after a 
while.

Giampaolo


 I expected that the address would keep a negative score after
 --add-to-whitelist, but it works only for one attempt. On the second (and
 further) attempts it doesn't work.
 Why doesn't AWL work as it should? It should change the score smoothly, e.g.
 -49.4, -25, -15 and so on... But it doesn't.
 
 Where am I wrong?
 
 p.s.
 My system:
 FreeBSD 5.4/Spamassassin 3.1.5/MySQL 4.1.18
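
The figures in the two reports above are consistent with how the AWL is generally documented to work: the adjustment pulls a message's score toward the sender's historical mean by a factor (auto_whitelist_factor, assumed here to be at its default of 0.5). The sketch below is only that arithmetic in Python, not SpamAssassin's own code, and the function name is made up for illustration.

```python
def awl_adjustment(total_score, count, message_score, factor=0.5):
    """Return the AWL adjustment for one message.

    Assumption: the adjustment moves the message score toward the
    sender's stored mean (total_score / count) by `factor`.
    """
    mean = total_score / count
    return (mean - message_score) * factor


# First check: the -W row holds a total of -100 over 1 message,
# and the message itself scores 1.109 before the AWL is applied.
first = awl_adjustment(-100.0, 1, 1.109)

# Second check: the row seen in the database holds 1.109 over 1 message,
# so the stored mean equals the message's own score.
second = awl_adjustment(1.109, 1, 1.109)

print(round(first, 2), round(second, 2))
```

With the -W row the adjustment is about -50.55, which the report rounds to -51 and which gives the -49.4 total. With the row seen after the first check, the mean equals the message's own score, so the adjustment is zero and no AWL line is shown.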
 -- 
 View this message in context: 
 http://www.nabble.com/AWL-and-whitelists-tf2518983.html#a7025648
 Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
 



Welcome to test russian ruleset

2006-10-27 Thread sa-russian
Hi everybody!
You are welcome to test a Russian ruleset for SpamAssassin. The ruleset file can be 
downloaded from:
http://sa-russian.narod.ru/99_russian_re.cf
The ruleset reflects a list of tokens often found in Russian spam. The list 
of tokens is available (in KOI8-R encoding) at:
http://sa-russian.narod.ru/tokens
The ruleset was tested on two Linux boxes with Perl 5.6 and 5.8.
Comments and reports are greatly appreciated.
Best regards.
Alan M. Makoev
-- 


Re: R: AWL and whitelists

2006-10-27 Thread Roman Sozinov


Giampaolo Tomassoni wrote:
 
 2. I do first test check:
 cat test-email | spamc -R
 Content analysis details:   (-49.4 points, 5.0 required)
  pts rule name              description
 ---- ---------------------- --------------------------------------------------
  0.6 HTML_SHORT_LENGTH      BODY: HTML is extremely short
  0.0 HTML_MESSAGE           BODY: HTML included in message
  0.5 DNS_FROM_RFC_ABUSE     RBL: Envelope sender in abuse.rfc-ignorant.org
  -51 AWL                    AWL: From: address is in the auto white-list
 In awl row: email|62.234|1|1.109 (e-mail|ip|count|score)
 Something strange: count is 1 still and score has become 1.109 :( 
 
 What happened to the email|none row? Is it still there?
 
Yes, it is still there:
email|62.234|1|1.109 (e-mail|ip|count|score)


Giampaolo Tomassoni wrote:
 
 3. I do second check 
 cat test-email | spamc -R
 Content analysis details:   (1.1 points, 5.0 required)
  pts rule name              description
 ---- ---------------------- --------------------------------------------------
  0.6 HTML_SHORT_LENGTH      BODY: HTML is extremely short
  0.0 HTML_MESSAGE           BODY: HTML included in message
  0.5 DNS_FROM_RFC_ABUSE     RBL: Envelope sender in abuse.rfc-ignorant.org
 In awl row: email|62.234|2|2.218 (e-mail|ip|count|score)
 Where is AWL check in the second check report? 
 
 Again, se if an email|null is still there
 
 Anyway, to whitelist a message source, use whitelist_from or, even better,
 whitelist_from_rcvd in your .cf file.
 
 If you use AWL to do this, your whitelist score may get consumed after a
 while.
 
 Giampaolo
 

I can't use a .cf file because it's a dynamic mail system, and users can
report spam/not spam with special buttons in webmail (like Gmail).
I don't understand how AWL works with the '--add-to-whitelist' function.
-- 
View this message in context: 
http://www.nabble.com/AWL-and-whitelists-tf2518983.html#a7026194
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.



RE: Welcome to test russian ruleset

2006-10-27 Thread vitas1

Hello!

On SpamAssassin 3.1.4 I got the following errors while running
spamassassin --lint:


[3403] warn: config: invalid regexp for rule BODY_KOI8_82: 

.

[3403] warn: config: invalid regexp for rule BODY_WIN1251_82: 

.

[3403] warn: config: warning: score set for non-existent rule BODY_WIN1251_82
[3403] warn: config: warning: score set for non-existent rule BODY_KOI8_82


WBR, 

Vitaly.

Re: FW: spamd scan problem

2006-10-27 Thread John Andersen
On Friday 27 October 2006 00:48, Frank van den Diepstraten wrote:
 (sorry for duplicate mails)

 Hi all,

 I've got a question about SpamAssassin. I've got 2 mail servers with an
 identical installation. 

 HTML_60_70,HTML_IMAGE_ONLY_24,HTML_MESSAGE,NO_REAL_NAME
 scantime=0.2, 

 ALL_TRUSTED,IP_LINK_PLUS,NORMAL_HTTP_TO_IP,RAZOR2_CF_RANGE_51_100,RAZOR2_CHECK,TW_JT
 scantime=5.2

Do you run razor on the first system?  Or any other network
tests?  If not, that alone may be the answer.

Now it could just be that your examples came out with no 
razor in the first one, but I gotta ask.


-- 
_
John Andersen


RE: FW: spamd scan problem

2006-10-27 Thread Frank van den Diepstraten
Thanks for your response. I think that's the problem, because when I grep for
RAZOR in the bad system's mail.log I get pages of output. When I do the same on
the good system there's no output. But now the question is where I can
disable this Razor thing...

Regards,

Frank.




RE: FW: spamd scan problem

2006-10-27 Thread Frank van den Diepstraten
OK, I understand that, but I want to know whether this causes the problem. So I
want to try it out without Razor... But I can't find the config file
where it's enabled.

Regards,

Frank.

-Original message-
From: John Andersen [mailto:[EMAIL PROTECTED]
Sent: Friday, 27 October 2006 11:36
To: users@spamassassin.apache.org
Subject: Re: FW: spamd scan problem


On Friday 27 October 2006 01:32, Frank van den Diepstraten wrote:
 But now the question is where I can
 disable this razor thing...

No no, you want to ENABLE it on the good system.

Razor is wonderful.  It just takes a little bit of time, but not
a great deal of CPU load.

Razor catches a lot of spam with an almost nonexistent
false positive rate.

--
_
John Andersen



Re: Rules to reject bounce messages for mail not sent by me

2006-10-27 Thread Justin Mason

existing set: http://wiki.apache.org/spamassassin/VBounceRuleset
;)

--j.

Nick Gilbert writes:
 Hi,
 
 I've been trying to write some SA rules to reject bounce messages which 
 I did not send.
 
 I've made a good start, but some bounce messages still get through but I 
 don't understand why.
 
 The theory is that viruses and spammers don't seem to use my full e-mail 
 address [EMAIL PROTECTED] but change the username part of it and send 
 from an address [EMAIL PROTECTED] I would like to reject all bounce 
 messages which have arisen from mail sent from [EMAIL PROTECTED] but NOT 
 [EMAIL PROTECTED]
 
 This works for about 50% of mail, but I think one serious problem is 
 that the line:
 
 header  __NICK_BOUNCE_REAL  To =~ /[EMAIL PROTECTED]/i
 
 ...matches on the header:
 
 X-MDaemon-Deliver-To: [EMAIL PROTECTED]
 
 Which I'm pretty sure it shouldn't! Why does it think that header is the 
 same as a normal To header? Surely it's not scanning for headers simply 
 ending in To?
 
 My rules are below for comment/improvement but please let me know if 
 there's a better way to do this or an existing set of working rules 
 somewhere.
 
 Nick...
 
 
 # -- BOUNCE DETECTION (stolen from
 # bogus_virus_warnings.cf)-
 # General rule to indicate bounce or otherwise - used for some other
 # rules
 header __BOUNCE_HEADER  X-Is-A-Bounce =~ /.+/
 
 # This won't match for scanning done at SMTP time, at least with Exim
 header __BOUNCE_RP1 Return-Path =~  /^$/
 
 # NL says this is added by amavisd-new before passing to SA
 header __BOUNCE_RP2 X-Return-Path =~ /^$/
 
 # Mark Martinec says the above is incorrect, and it's X-Envelope-From
 header __BOUNCE_RP3 X-Envelope-From =~ /^$/
 
 meta __NULL_SENDER  __BOUNCE_HEADER || __BOUNCE_RP1 || __BOUNCE_RP2 || __BOUNCE_RP3
 
 # Thanks to AF
 header __CT_DEL_STATUS  Content-Type =~ /report-type=delivery-status/
 
 meta __NICK_IS_A_BOUNCE __NULL_SENDER || __CT_DEL_STATUS
 
 header  __NICK_BOUNCE_REAL  To =~ /[EMAIL PROTECTED]/i
 header  __NICK_TO_NOT_ME    To =~ /[EMAIL PROTECTED]/i
 
 meta NICK_SPOOF_BOUNCE (( __NICK_IS_A_BOUNCE && __NICK_TO_NOT_ME) && (!__NICK_BOUNCE_REAL))
 score NICK_SPOOF_BOUNCE 10
 describe  NICK_SPOOF_BOUNCE Attached bounce contains my address but I never sent this!
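
The NICK_SPOOF_BOUNCE meta rule quoted above seems to have lost its operators in the archive. Assuming they were `&&`, which matches the stated intent (a bounce, addressed to the domain, but not to the real address), the combination can be sketched in Python as follows. The function and argument names are illustrative only and simply mirror the sub-rule names.

```python
def nick_spoof_bounce(is_a_bounce, to_not_me, bounce_real):
    """Mirror of the NICK_SPOOF_BOUNCE meta rule.

    Assumption: the operators stripped from the archived rule were `&&`.
    """
    return (is_a_bounce and to_not_me) and (not bounce_real)


# A bounce addressed to a made-up user at the domain fires the rule;
# a bounce addressed to the real address does not.
spoofed = nick_spoof_bounce(True, True, False)
genuine = nick_spoof_bounce(True, True, True)
print(spoofed, genuine)
```

Note that under this reading, if the pattern behind __NICK_BOUNCE_REAL matches any header it was not meant to match, the rule is suppressed, which would fit the missed bounces described in the message.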


mcafee-spamassassin-rules

2006-10-27 Thread Johann Spies
We are using McAfee's anti-virus product on our mail servers and we
mirror their files from ftp.nai.com on an hourly basis. Today I saw
something that I did not realise they provide:
mcafee-spamassassin-perl-1.0.2620-1.5002.i386.rpm
mcafee-spamassassin-rules-1.0.2620-2620.5002.i386.rpm

I thought that if they provide updated rules on a daily basis, I can
just as well try and use those rules.  However, they were written for
version 2.6 and 3.0.3-2sarge1 is complaining about those rules.

Is there a way to utilize their updates with later versions of
SpamAssassin?  Or do I have to use their version of SpamAssassin to do
so?  Would that be advisable?

Regards
Johann
-- 
Johann Spies  Telefoon: 021-808 4036
Informasietegnologie, Universiteit van Stellenbosch

 If a man abide not in me, he is cast forth as a  
  branch, and is withered; and men gather them, and cast
  them into the fire, and they are burned. 
 John 15:6 


Re: Rules to reject bounce messages for mail not sent by me

2006-10-27 Thread Nick Gilbert

Justin Mason wrote:

existing set: http://wiki.apache.org/spamassassin/VBounceRuleset
;)


Thanks!

One thing I'm not sure about - that module produces two rules. How 
should I score the rules so that real bounces aren't rejected but the 
fake ones are?


I presume I do it this way round:

score BOUNCE_MESSAGE  10
score ANY_BOUNCE_MESSAGE 0.1

I presume BOUNCE_MESSAGE is only true if the bounce is for a mail not 
sent by me? If so, I'm surprised the rule name isn't 
SPOOF_BOUNCE_MESSAGE or similar.


My mail server rejects messages with spam scores of 10 or above.

Nick...



--


Nick Gilbert, Software Developer
X-RM Limited, Winchester, UK
W: http://www.x-rm.com/
E: [EMAIL PROTECTED]
T: 01962 877237
F: 01962 842346




Re: Rules to reject bounce messages for mail not sent by me

2006-10-27 Thread Nick Gilbert
PS. Will setting up SPF on my domain name have any effect for things 
like this? Will it discourage spammers from using my domain or reduce 
the number of bounce messages I/we get?


Nick...





using --add-to-blacklist feature of spamassassin

2006-10-27 Thread ankush grover

hey friends,

I am using SA 3.1.3 on FC3 with Postfix. I tried the
--add-to-blacklist feature of spamassassin.

spamassassin --add-to-blacklist /home/testing/Maildir/.spam/cur/
SpamAssassin auto-whitelist: adding address to blacklist: [EMAIL PROTECTED]


Is this the right way to use this command, and can somebody tell me the
path of the blacklist file where these addresses are being added?

I ran the above command as root, and the .spamassassin directory
contains these files:

auto-whitelist  bayes_seen  bayes_toks  user_prefs

But I am not able to find the blacklist file, and when I ran
spamassassin --lint I got the error below:

[17284] warn: auto-whitelist: open of auto-whitelist file failed:
auto-whitelist: cannot open auto_whitelist_path
/root/.spamassassin/auto-whitelist: Inappropriate ioctl for device

Why does this error occur, and what should I do to get rid of it?


Please let me know if you need any further input.

Thanks & Regards

Ankush Grover


Re: High CPU running SA in a VMware VM

2006-10-27 Thread d.hill

On Thu, 26 Oct 2006 21:48:22 -0700
 Gary W. Smith [EMAIL PROTECTED] wrote:

Did you pre-allocate the disk space? If not you
might consider do that first and defragging the disk.


Good point! I forgot about the disk space.


How to test new plugins

2006-10-27 Thread Patrick Sherrill

How can you test new plugins?

[EMAIL PROTECTED]
CocoNet Corporation
SW Florida's First ISP
825 SE 47th Terrace
Cape Coral, FL 33904
(239) 540-2626 Voice




Re: Per Domain Whitelisting

2006-10-27 Thread Peter H. Lemieux

jasonegli wrote:

For example let's say that domain xyz.com wants to allow all messages from
yahoo.com, but domain 123.com does not. Is there a way to allow FROM
[EMAIL PROTECTED] TO [EMAIL PROTECTED]?


Obtuse SMTPD (http://sd.inodes.org/) can handle this at the SMTP level. 
I think it may be possible to add this to MailScanner 
(http://www.mailscanner.info/) through its custom rules; its default 
whitelists/blacklists, however, are global.
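
As a rough illustration of the policy being asked for (and not of how Obtuse SMTPD or MailScanner implement it), the check amounts to a lookup keyed on the recipient's domain. The table and function names below are hypothetical, and the two domains are the ones from the question.

```python
# Hypothetical per-recipient-domain whitelist: each recipient domain lists
# the sender domains it chooses to accept unconditionally.
WHITELIST_BY_RECIPIENT_DOMAIN = {
    "xyz.com": {"yahoo.com"},
    "123.com": set(),
}


def domain_of(address):
    """Return the lower-cased domain part of an e-mail address."""
    return address.rsplit("@", 1)[-1].lower()


def is_whitelisted(sender, recipient):
    """True if the recipient's domain whitelists the sender's domain."""
    allowed = WHITELIST_BY_RECIPIENT_DOMAIN.get(domain_of(recipient), set())
    return domain_of(sender) in allowed


print(is_whitelisted("someone@yahoo.com", "user@xyz.com"))
print(is_whitelisted("someone@yahoo.com", "user@123.com"))
```

Keying on the recipient domain first keeps each domain's list independent, which is the difference from a global whitelist.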





Re: spamassassin --lint fails with rules in local.cf

2006-10-27 Thread Matt Kettler
Dylan Bouterse wrote:


 [EMAIL PROTECTED] spamassassin]# pwd
 /usr/share/spamassassin
 [EMAIL PROTECTED] spamassassin]# grep SARE_GIF_ATTACH *
 70_sare_stocks.cf:full     SARE_GIF_ATTACH   /name=\?[0-9a-z._\-]{3,18}\.gif\?/i
 70_sare_stocks.cf:describe SARE_GIF_ATTACH   Email has a inline gif
 70_sare_stocks.cf:score    SARE_GIF_ATTACH   0.75
 [EMAIL PROTECTED] spamassassin]# grep SARE_GIF_STOX *
 70_sare_stocks.cf:describe SARE_GIF_STOX     Inline Gif with little HTML
 70_sare_stocks.cf:score    SARE_GIF_STOX     1.66
 [EMAIL PROTECTED] spamassassin]# grep SARE_SPEC_XXGEOCITIES2 *
 70_sare_specific.cf:meta      SARE_SPEC_XXGEOCITIES2   !__SARE_SPEC_XXGEOCITIE && __SARE_SPEC_XX2GEOCIT
 70_sare_specific.cf:describe  SARE_SPEC_XXGEOCITIES2   spamsign pointing to free webhost spam site
 70_sare_specific.cf:score     SARE_SPEC_XXGEOCITIES2   1.666
 [EMAIL PROTECTED] spamassassin]# grep SARE_SPEC_XXGEOCITIES3 *
 70_sare_specific.cf:meta      SARE_SPEC_XXGEOCITIES3   __SARE_SPEC_XXGEOCITIE && __SARE_SPEC_XX2GEOCIT
 70_sare_specific.cf:describe  SARE_SPEC_XXGEOCITIES3   spamsign pointing to free webhost spam site
 70_sare_specific.cf:score     SARE_SPEC_XXGEOCITIES3   1.666

 My guess is that the lint check is reading the local.cf file before
 the additional SARE rule sets. My --lint output reads:

 [16109] dbg: config: using /etc/mail/spamassassin for site rules pre files
 [16109] dbg: config: read file /etc/mail/spamassassin/init.pre
 [16109] dbg: config: read file /etc/mail/spamassassin/v310.pre
 [16109] dbg: config: read file /etc/mail/spamassassin/v312.pre
 [16109] dbg: config: using /var/lib/spamassassin/3.001003 for sys rules pre files
 [16109] dbg: config: using /var/lib/spamassassin/3.001003 for default rules dir
 [16109] dbg: config: read file /var/lib/spamassassin/3.001003/updates_spamassassin_org.cf
 [16109] dbg: config: using /etc/mail/spamassassin for site rules dir
 [16109] dbg: config: read file /etc/mail/spamassassin/local.cf

 And the SARE ruleset configs come after that. My SARE rulesets are in
 /usr/share/spamassassin. Should I put my local.cf file there as well,
 or am I going down the wrong path?

You're using the wrong path. Move your SARE rules to
/etc/mail/spamassassin/ where they belong.

The SARE rulesets must be parsed BEFORE your local.cf.

Also, are you sure the ones in /usr/share/spamassassin are even being
parsed? According to the above, your system is using
/var/lib/spamassassin/3.001003 instead of /usr/share/spamassassin.

That said, in general, don't monkey with anything but the site rules
dir. Any other rule directories, such as the default rules dir, are
for SA's own rules, and the SA installer feels perfectly free to rm -f *
on those directories.
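
Moving the SARE files into the site rules dir should give the ordering described above, on the assumption that SpamAssassin reads the .cf files in a directory in sorted file-name order: names starting with a digit such as 70_ sort ahead of local.cf. The Python sketch below only illustrates that sort order; the helper name is made up, and the file list is taken from the grep output in this thread.

```python
def parse_order(filenames):
    """Return the .cf file names in plain lexical order.

    Assumption: SpamAssassin reads a rules directory in this order.
    """
    return sorted(name for name in filenames if name.endswith(".cf"))


site_rules = ["local.cf", "70_sare_stocks.cf", "70_sare_specific.cf"]
order = parse_order(site_rules)
print(order)
```

Under that assumption local.cf comes last, so score lines in it refer to rules that have already been defined.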





Re: Where is the latest Imageinfo?

2006-10-27 Thread Jeff Chan
Not sure if it's the latest, but a reference is:

  http://www.rulesemporium.com/plugins.htm#imageinfo

Jeff C.
-- 
Jeff Chan
mailto:[EMAIL PROTECTED]
http://www.surbl.org/



Problems with header rewrite

2006-10-27 Thread Hans München
Hi,

I hope someone can help me with the header rewrite.
I'm using FC6, SA 3.1.4 and Evolution as MUA.

My local.cf looks like that:

# SpamAssassin config file for version 3.x
# NOTE: NOT COMPATIBLE WITH VERSIONS 2.5 or 2.6
# See http://www.yrex.com/spam/spamconfig25.php for earlier versions
# Generated by http://www.yrex.com/spam/spamconfig.php (version 1.50)

# How many hits before a message is considered spam.
required_score   5.0

# Change the subject of suspected spam
rewrite_header subject *SPAM*

# Encapsulate spam in an attachment (0=no, 1=yes, 2=safe)
report_safe 1

# Enable the Bayes system
use_bayes   1

# Enable Bayes auto-learning
bayes_auto_learn  1

# Enable or disable network checks
skip_rbl_checks 1
use_razor2  1
use_dcc 1
use_pyzor   1

# Mail using languages used in these country codes will not be marked
# as being possibly spam in a foreign language.
ok_languages    all

# Mail using locales used in these country codes will not be marked
# as being possibly spam in a foreign language.
ok_locales  all

The file's permissions are 644.

But when I send myself a GTUBE mail, the headers are not rewritten
and the subject is not changed either.

Does anyone have any idea why the headers are not being changed?

P.S. Sorry for my possibly bad English.

-- 
Greetings out of Munich
Hans



Re: Problems with header rewrite

2006-10-27 Thread Matt Kettler
Hans München wrote:
 Hi,

 hope someone can help me with the header rewrite.
 I'm user FC6, SA 3.1.4 and Evolution as MUA.

 My local.cf looks like that:


   
snip
 chmod is 644.

 But  when  I send me an GTUBE mail, the header don't will be rewritten
 and also he subject don't will be changed.

 Does someone has any idea, why the header will be not changed?
   

Where have you integrated SA into your mail processing? It sounds like
SA isn't even being called.

Did you configure Evolution to feed the mail to SA?

SA isn't actually called when you just install it, you have to
explicitly configure something to call it. There's dozens of different
ways to do this, so this can't just happen automatically when you
install. The installer wouldn't know where you wanted SA inserted :)

Here's one web article showing how to get evolution to pipe mail into SA:

http://www.atlantawebhost.com/articles/evolution_spamassassin.php

(note: I don't use Evolution, so I can't attest to the accuracy.
However, this looks correct)






RE: spamassassin --lint fails with rules in local.cf (now perl plugin error for TextCat)

2006-10-27 Thread Dylan Bouterse


Amavisd read the /usr/share/spamassassin dir which is probably why
--lint didn't work but reloading amavisd would work. Either way.

I moved my /usr/share/spamassassin dir contents to
/etc/mail/spamassassin. I get the following errors when trying to
--lint. 

[3246] dbg: plugin: loading Mail::SpamAssassin::Plugin::TextCat from @INC
[3246] warn: textcat: languages filename not defined
[3246] dbg: plugin: registered Mail::SpamAssassin::Plugin::TextCat=HASH(0x9760db8)

[3246] warn: config: invalid regexp for rule SUBJ_SOMEONE_WROTE: Subject =~ /\bwrote:$/i: missing or invalid delimiters
[3246] warn: config: warning: description exists for non-existent rule SUBJ_SOMEONE_WROTE

[3246] warn: config: warning: score set for non-existent rule SUBJ_SOMEONE_WROTE

[3246] warn: Use of uninitialized value in hash element at /usr/lib/perl5/site_perl/5.8.5/Mail/SpamAssassin/Plugin/TextCat.pm line 380.
[3246] warn: Use of uninitialized value in join or string at /usr/lib/perl5/site_perl/5.8.5/Mail/SpamAssassin/Plugin/TextCat.pm line 391.
[3246] dbg: textcat: language possibly:
[3246] warn: Use of uninitialized value in join or string at /usr/lib/perl5/site_perl/5.8.5/Mail/SpamAssassin/Plugin/TextCat.pm line 469.

The SUBJ_SOMEONE_WROTE rule that was posted a week ago or so on the list
isn't passing.
20_phrases.cf:body SUBJ_SOMEONE_WROTE   Subject =~ /\bwrote:$/i
20_phrases.cf:describe SUBJ_SOMEONE_WROTE   Search for Subject lines ending in wrote:
50_scores.cf:score SUBJ_SOMEONE_WROTE 3.000

I still get the TextCat errors even if I comment out the
SUBJ_SOMEONE_WROTE rule.
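
One likely explanation for the "missing or invalid delimiters" warning, not confirmed anywhere in this thread: the rule is declared as `body`, and a body rule takes only a regexp, whereas `Subject =~ /.../` is the syntax of a `header` rule. The Python sketch below is a deliberately rough stand-in for that delimiter check, not SpamAssassin's parser, and the helper name is made up.

```python
import re


def looks_like_regexp_literal(text):
    """Rough check: does `text` begin with a /.../ or m{...}-style delimiter?

    A simplification for illustration, not SpamAssassin's real parser.
    """
    return re.match(r"\s*(/|m\W)", text) is not None


body_style = r"/\bwrote:$/i"
header_style = r"Subject =~ /\bwrote:$/i"

# What a body rule would accept versus what the posted rule supplies.
print(looks_like_regexp_literal(body_style))
print(looks_like_regexp_literal(header_style))

# The pattern itself is fine for subjects that end in "wrote:".
subject_matches = re.search(r"\bwrote:$", "Re: John wrote:", re.IGNORECASE)
print(subject_matches is not None)
```

If that reading is right, changing the rule type from `body` to `header` in 20_phrases.cf should clear the three SUBJ_SOMEONE_WROTE warnings; the TextCat warnings look unrelated.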

Dylan


[OT] Filter Server Specs

2006-10-27 Thread Duane Hill
Currently, we are looking to install a server that will be doing content 
filtering for our main e-mail server. I thought I would toss this out to 
everyone to get some feedback on if the server would be adequate.


The server is a Dell PowerEdge 6850 with the following:

 - Four 2.6 GHz/800 MHz/4 MB cache dual-core Intel Xeon 7110M processors
 - Eight GB DDR2 400 MHz RAM
 - Four 300 GB, 3 Gbps, SAS, 10K RPM hard drives running RAID-5 on a 
PERC5/i controller


Our main e-mail server services over 500 domains with an account total 
of around 40,000.


The current filter server we have cannot do any content filtering 
beyond what the MTA itself does, because of the CPU load (i.e. 
SpamAssassin). Any message scanning where the message size is over 1.5K 
will kill the CPU. The current filter server is 
rejecting an average of 2.4 million per day with just the common 
blacklisting and some other things that are set in place.
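
For a sense of scale, the daily figure above can be turned into an average per-second rate. This is plain arithmetic in Python, with a made-up helper name; it says nothing about peaks, which are usually what matters when sizing a server.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_per_second(events_per_day):
    """Convert a daily total into an average per-second rate."""
    return events_per_day / SECONDS_PER_DAY


rate = average_rate_per_second(2_400_000)
print(f"{rate:.1f} rejections per second on average")
```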


The other thing I would like to know is what kind of an operating system 
would one install on this new server?


Again, I would appreciate any feedback.


Re: How to test new plugins

2006-10-27 Thread Theo Van Dinter
On Fri, Oct 27, 2006 at 08:08:36AM -0400, Patrick Sherrill wrote:
 How can you test new plugins?

Load the plugin and include any associated configs, then see what happens.

(the question is extremely vague, so this answer is probably not very useful.)

-- 
Randomly Selected Tagline:
What the hell is this?  For crying out loud, somebody throw a pie!
 - Peter Griffin on Family Guy




RE: High CPU running SA in a VMware VM

2006-10-27 Thread Sammy Anderson
The I/O rate is pretty low. The files going through expiration are only about
5 MB, and it only takes one of these to drive the CPU up. I think there are
over 100,000 tokens in the file, each with a timestamp, and I believe there
must be some sorting going on, so I suspect that is where the issue is.

Thanks,
Ian

"Gary W. Smith" [EMAIL PROTECTED] wrote:

 What does the IO usage look like on the server? We ran a couple of our
 backup SA instances on VMware, but the database is on a remote SQL server,
 so the only IO is logging. We have several VM instances for a variety of
 things. Did you pre-allocate the disk space? If not, you might consider
 doing that first and defragging the disk.

 From: Sammy Anderson [mailto:[EMAIL PROTECTED]
 Sent: Thursday, October 26, 2006 3:52 PM
 To: users@spamassassin.apache.org
 Subject: High CPU running SA in a VMware VM

 We recently migrated our SpamAssassin installation from a physical 3.6 GHz
 system running RHEL 4 and SA 3.0.4 to a VMware VM (ESX 2.5.4) with RHEL 4 as
 the guest OS and SA 3.1.7. Each user has their own Bayes files (Berkeley DB)
 and these were copied from the old to the new server. Now whenever an expiry
 process runs on a user's database, the CPU spikes, sometimes for a minute or
 longer. We did not notice spikes on the old server, but it is really
 hammering the VM. Has anyone else experienced this problem? For now I have
 disabled Bayes altogether because of the unacceptable load.

 --SA

Re: High CPU running SA in a VMware VM

2006-10-27 Thread Sammy Anderson
The guest has more memory than it is using, so it isn't doing any paging
or swapping. As for the ESX 2.5.4 box, it isn't swapping either. There is
currently enough physical RAM for the few VMs running.

[EMAIL PROTECTED] wrote:

On Thu, 26 Oct 2006 15:52:17 -0700 (PDT) Sammy Anderson wrote:

We recently migrated our SpamAssassin installation from a physical 3.6
GHz system running RHEL 4 and SA 3.0.4 to a VMware VM (ESX 2.5.4) with
RHEL 4 as the guest OS and SA 3.1.7. Each user has their own Bayes files
(Berkeley DB) and these were copied from the old to the new server. Now
whenever an expiry process runs on a user's database, the CPU spikes,
sometimes for a minute or longer. We did not notice spikes on the old
server, but it is really hammering the VM. Has anyone else experienced
this problem? For now I have disabled Bayes altogether because of the
unacceptable load.

Perhaps memory started to spill into the swap on either the VM or guest
OS.

I don't know what version of VMware you are using. I'm using v5.2.2
running under Windows. In the memory preferences I have mine set so all
the virtual machine memory has to fit into the reserved host RAM. I've
done small tests with SA before and haven't had any problems. Then
again, I haven't found anything I can use to put a load on a test
install. My test bed is on a dual-core 3.2 GHz with four gig of RAM. The
VM has a full gig of RAM allocated and is running the release version of
FreeBSD 6.1.

Re: mcafee-spamassassin-rules

2006-10-27 Thread Theo Van Dinter
On Fri, Oct 27, 2006 at 12:25:53PM +0200, Johann Spies wrote:
 just as well try and use those rules.  However, they were written for
 version 2.6 and 3.0.3-2sarge1 is complaining about those rules.

My recollection is that they're using a pre-3.0 version of SA, with (I'd
imagine) a number of modifications.

 Is there a way to utilize their updates with the later versions of
 spamassassin?  Or do I have to use their version of spamassassin to do
 so?  Would that be advisable?

It's hard to say since they could have modified their SA in any number
of ways.  You'd want to go through the config line by line and see what
can be used directly, what could be used with modification, and what
can't be used because it requires proprietary changes.  It's also worth
keeping in mind that spam detection isn't just about rules, it's also about
the engine, so just because rules work well with their code doesn't mean it'll
work well on the standard code.

It's also worth noting that hypothetically, if I was a company releasing
updates based on an open-source product, I may have incentive to avoid
making those updates useful on said product, otherwise people would
download my updates and not pay me for the software.

-- 
Randomly Selected Tagline:
the real ttys became pseudo ttys and vice-versa. - Today's BOFH Excuse




Re: what's the matter here? Text::Wrap

2006-10-27 Thread Theo Van Dinter
On Fri, Oct 27, 2006 at 05:15:57PM +0800, Xueron Nee wrote:
 When I use CPAN to upgrade my SA from 3.1.4 to current version, it prints
 many warnings like these:
 
 t/rcvd_parser...ok 40/53(?:(?=[\s,]))* matches null string many 
 times in regex; marked by <-- HERE in m/\G(?:(?=[\s,]))* <-- HERE \Z/ at 
 /usr/lib/perl5/5.8.5/Text/Wrap.pm line 46.
[...]
 
 Seems there is something wrong with Text::Wrap.

Yep.

 # perl -MText::Wrap -e 'print $Text::Wrap::VERSION;'
 2006.0711
 
 cpan install Text::Wrap
 Text::Wrap is up to date.

Yeah, you need to downgrade since they haven't fixed this bug yet.

See http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5056 for more info.

-- 
Randomly Selected Tagline:
I like work; it fascinates me; I can sit and look at it funny...




'spamassassin --revoke' and 'razor-revoke' are interchangeable?

2006-10-27 Thread Leon Kolchinsky
Hello all,

Could someone tell me if 'spamassassin --revoke' and 'razor-revoke' are 
interchangeable?

What exactly happens when I revoke a 'false negative' message? 
Are its details reported to the razor2 DB and the Bayesian DB as ham? 
Are these messages resent to the original recipients?


Can I use the following syntax on my Cyrus system?:
spamassassin --revoke /ham_folder/*
or
/usr/lib/razor-revoke /ham_folder/*
sa-learn --showdots --ham /ham_folder/*



Regards,
Leon Kolchinsky



Re: Scoring base64 blob messages

2006-10-27 Thread Theo Van Dinter
On Thu, Oct 26, 2006 at 12:19:23PM -0400, Peter H. Lemieux wrote:
 No, because there are going to be a lot of mails that would hit that.
 
 Really?  Maybe it's because I live in the US, but I can't think of a 
 legitimate message I've ever received consisting only of a base64 blob. 

You look at a lot of raw messages?  ;)

 Out of curiosity, how frequently does this appear in the SA ham corpus? 

Well, there isn't a SA corpus, so there's no answer to that question.  As
for how often it happens in my corpus, I don't know; I'd have to write a rule
and run it against the messages.

 Rather than making anyone else do the work for me, is there something I 
 can read about how to determine the frequency of different message 
 features appearing in the corpus?

You can generate some rules and use mass-check to run against your own corpus
to gather some statistics.  I'm willing to run some rules for you against my
corpus if you want.  I just don't have time to come up with the rules right
now.

-- 
Randomly Selected Tagline:
strrev(strcpy(xus yti +7,varg)-7)[0]='G'




Re: Scoring base64 blob messages

2006-10-27 Thread Stuart Johnston

Peter H. Lemieux wrote:

Theo Van Dinter wrote:

On Thu, Oct 26, 2006 at 09:46:28AM -0400, Peter H. Lemieux wrote:
Also is there an SA rule that scores messages that contain only a 
single base64 part (as opposed to a base64-encoded attachment)?  I 
doubt many legitimate messages arrive with only a single base64 part.


No, because there are going to be a lot of mails that would hit that.


Really?  Maybe it's because I live in the US, but I can't think of a 
legitimate message I've ever received consisting only of a base64 blob. 
Out of curiosity, how frequently does this appear in the SA ham corpus? 
Rather than making anyone else do the work for me, is there something I 
can read about how to determine the frequency of different message 
features appearing in the corpus?


Most messages sent from a Blackberry would hit this rule, for example.


RE: I'm thinking about suing Microsoft

2006-10-27 Thread Michael Beckmann
I think there is a problem where a version of XP downloads the security 
patches automatically, but does not install them. This does not lead to 
increased security, because most users are ignorant of security patches and 
would never install them manually.


Michael

--On Monday, 23 October 2006 16:46 -0400 Rose, Bobby 
[EMAIL PROTECTED] wrote:




But windows patches are free.  Even if you are using an illegal copy of
windows, you can still manually download and install the patches.  It's
Microsoft Update where they mostly have the genuine windows verification
code.  Even Redhat forces you to pay subscriptions for their autoupdate
management stuff.

-Original Message-
From: Marc Perkel [mailto:[EMAIL PROTECTED]
Sent: Monday, October 23, 2006 3:59 PM
To: Jo
Cc: Duane Hill; users@spamassassin.apache.org
Subject: Re: I'm thinking about suing Microsoft



Popularity is a factor. But the real vulnerability is that Windows can
be more secure if it has the patches. If Linux for example restricted
its security patches to only licensed users they would have the same
problem. I'm not saying either that MS should be compelled to distribute
any upgrades for free. Just security fixes.





Re: Per Domain Whitelisting

2006-10-27 Thread Daryl C. W. O'Shea

Roman Sozinov wrote:



Peter H. Lemieux wrote:

jasonegli wrote:

For example let's say that domain xyz.com wants to allow all messages
from
yahoo.com, but domain 123.com does not. Is there a way to allow FROM
[EMAIL PROTECTED] TO [EMAIL PROTECTED]?
Obtuse SMTPD (http://sd.inodes.org/) can handle this at the SMTP level. 
I think it may be possible to add this to MailScanner 
(http://www.mailscanner.info/) through its custom rules; its default 
whitelists/blacklists, however, are global.




What about SpamAssassin? Does it support per-domain whitelisting?


Of course it does.  It supports per user preferences, so if you pass 
nothing but domain names it thus supports per domain preferences.
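
A minimal sketch of the idea above: pass only the recipient's domain as the
"user" so preferences are looked up per domain. The function names are
hypothetical, and it assumes spamd is configured to accept a domain name as
a user name (for example with SQL-based or virtual-user preferences); only
spamc's -u option is taken from the stock client.

```shell
#!/bin/sh
# Print the lowercased domain part of an address such as user@Example.COM.
recipient_domain() {
    printf '%s\n' "${1##*@}" | tr '[:upper:]' '[:lower:]'
}

# Scan a message from stdin using the preferences stored for that domain.
# Assumption: spamd is set up so "example.com" is accepted as a user name.
scan_for_recipient() {
    spamc -u "$(recipient_domain "$1")"
}
```

For example, "scan_for_recipient bob@xyz.com < message" would scan with the
preferences stored for xyz.com, where a whitelist_from entry for yahoo.com
could live without affecting 123.com.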


Daryl


Re: How to test new plugins

2006-10-27 Thread Patrick Sherrill
I guess what I'm looking for is a way to test the plug-ins/configuration 
against a separate instance of sa that would read the new cfs without 
restarting existing daemons (we're using amavis-new).

Pat...

- Original Message - 
From: Theo Van Dinter [EMAIL PROTECTED]

To: users@spamassassin.apache.org
Sent: Friday, October 27, 2006 11:00 AM
Subject: Re: How to test new plugins




Re: Scoring base64 blob messages

2006-10-27 Thread Daryl C. W. O'Shea

Peter H. Lemieux wrote:

Theo Van Dinter wrote:

On Thu, Oct 26, 2006 at 09:46:28AM -0400, Peter H. Lemieux wrote:


Also is there an SA rule that scores messages that contain only a 
single base64 part (as opposed to a base64-encoded attachment)?  I 
doubt many legitimate messages arrive with only a single base64 part.


No, because there are going to be a lot of mails that would hit that.


Really?  Maybe it's because I live in the US, but I can't think of a 
legitimate message I've ever received consisting only of a base64 blob.


I'm not sure what to say to that. ;)


Out of curiosity, how frequently does this appear in the SA ham corpus? 


Ticketmaster sends out *a lot* of their mail this way.  I'm sure it's 
partly in an attempt to avoid having their mail FP against crappy filters.



Daryl


Re: How to test new plugins

2006-10-27 Thread Theo Van Dinter
On Fri, Oct 27, 2006 at 12:40:57PM -0400, Patrick Sherrill wrote:
 I guess what I'm looking for is a way to test the plug-ins/configuration 
 against a separate instance of sa that would read the new cfs without 
 restarting existing daemons (we're using amavis-new).

You can copy the /etc/mail/spamassassin directory to somewhere else,
then change the pre and cf files in that dir.  Then you can test
spamassassin/spamd/etc with the --siteconfigpath option to override its
default value.  :)

(for spamd, if you already have a running copy at port 783, you'd want to run
it and spamc via a different port, of course.)
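
A rough sketch of the steps above, in case it helps. The scratch directory
/tmp/sa-test-config and port 1783 are illustrative choices, not anything
mandated by SpamAssassin, and the function names are made up for this
example. The switches used (--siteconfigpath, --lint, -t, -p) are the
documented ones, but check them against your installed version.

```shell
#!/bin/sh
# Hypothetical locations: adjust to your system.
SITE_DIR=${SITE_DIR:-/etc/mail/spamassassin}   # live site config
TEST_DIR=${TEST_DIR:-/tmp/sa-test-config}      # scratch copy (assumed path)
TEST_PORT=${TEST_PORT:-1783}                   # any free port other than 783

# Copy the live .pre/.cf files so edits never touch the running daemons.
stage_config() {
    mkdir -p "$TEST_DIR" && cp -R "$SITE_DIR"/. "$TEST_DIR"/
}

# Syntax-check the scratch config.
lint_config() {
    spamassassin --siteconfigpath="$TEST_DIR" --lint
}

# Score one message (read from stdin) with the scratch config, in test mode.
scan_message() {
    spamassassin --siteconfigpath="$TEST_DIR" -t
}

# Start a second spamd on the alternate port (runs in the foreground).
start_test_spamd() {
    spamd --siteconfigpath="$TEST_DIR" -p "$TEST_PORT"
}

# Talk to that second spamd instead of the production one on port 783.
scan_via_test_spamd() {
    spamc -p "$TEST_PORT"
}
```

Typical use would be "stage_config && lint_config", then edit the pre and cf
files under $TEST_DIR and feed a saved message to scan_message on stdin.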

-- 
Randomly Selected Tagline:
linux: because a PC is a terrible thing to waste
 ([EMAIL PROTECTED] put this on Tshirts in '93)




Re: Scoring base64 blob messages

2006-10-27 Thread Theo Van Dinter
On Fri, Oct 27, 2006 at 11:44:48AM -0400, Daryl C. W. O'Shea wrote:
 Ticketmaster sends out *a lot* of their mail this way.  I'm sure it's 
 partly in an attempt to avoid having their mail FP against crappy filters.

I'd also imagine that sometimes it's just easier to do this than try to pay
attention to what is being sent and determine if encoding is necessary.
Programmers tend to be lazy after all. :)

-- 
Randomly Selected Tagline:
There are two major products to come out of Berkeley: LSD and UNIX.  We
 don't believe this to be a coincidence.  - Unknown




Re: spamd scan problem

2006-10-27 Thread Peter Teunissen


On 27 Oct 2006, at 11:40, Frank van den Diepstraten wrote:

OK, I understand that, but I want to know if this causes the problem, so I
want to try it out without that razor thing... But I can't find the config
where it's enabled.



Hi Frank,


To disable razor, add the following to your local.cf:

use_razor2  0

Peter


-Original Message-
From: John Andersen [mailto:[EMAIL PROTECTED]
Sent: Friday, October 27, 2006 11:36
To: users@spamassassin.apache.org
Subject: Re: FW: spamd scan problem


On Friday 27 October 2006 01:32, Frank van den Diepstraten wrote:

But now the question is where I can
disable this razor thing...


No no, you want to ENABLE it on the good system.

Razor is wonderful.  It just takes a little bit of time, but not
a great deal of CPU load.

Razor catches a lot of spam with an almost nonexistent
false positive rate.

--
_
John Andersen





RE: mcafee-spamassassin-rules

2006-10-27 Thread Chris Santerre






 It's also worth noting that hypothetically, if I was a 
 company releasing
 updates based on an open-source product, I may have incentive to avoid
 making those updates useful on said product, otherwise people would
 download my updates and not pay me for the software.


Wouldn't that be against the open source lic? 


I'm sure they don't use open source rules either. *giggle*


--Chris 





Re: I'm thinking about suing Microsoft

2006-10-27 Thread Jay Chandler
You have to explicitly choose that option.  Are you suggesting we
shouldn't be able to choose that?  I'm not a big fan of trusting MS
patches, as they tend to break things periodically...

On Oct 27, 2006, at 8:47 AM, Michael Beckmann wrote:

I think there is a problem where a version of XP downloads the security
patches automatically, but does not install them. This does not lead to
increased security, because most users are ignorant of security patches
and would never install them manually.

Michael

--On Monday, 23 October 2006 16:46 -0400 "Rose, Bobby" [EMAIL PROTECTED] wrote:

But windows patches are free.  Even if you are using an illegal copy of
windows, you can still manually download and install the patches.  It's
Microsoft Update where they mostly have the genuine windows verification
code.  Even Redhat forces you to pay subscriptions for their autoupdate
management stuff.

-Original Message-
From: Marc Perkel [mailto:[EMAIL PROTECTED]]
Sent: Monday, October 23, 2006 3:59 PM
To: Jo
Cc: Duane Hill; users@spamassassin.apache.org
Subject: Re: I'm thinking about suing Microsoft

Popularity is a factor. But the real vulnerability is that Windows can
be more secure if it has the patches. If Linux for example restricted
its security patches to only licensed users they would have the same
problem. I'm not saying either that MS should be compelled to distribute
any upgrades for free. Just security fixes.

-- 
Jay Chandler
Network Administrator, Chapman University
714-628-7249 / [EMAIL PROTECTED]
"Bother," said Pooh as he struggled with /etc/sendmail.cf, "it never
does quite what I want.  I wish Christopher Robin was here."
 -- Peter Da Silva in a.s.r.

RE: High CPU running SA in a VMware VM

2006-10-27 Thread Ring, John C
From: Sammy Anderson [mailto:[EMAIL PROTECTED] 

We recently migrated our SpamAssassin installation from a physical 3.6
GHz system
running RHEL 4 and SA 3.0.4 to a VMware VM (ESX 2.5.4) with RHEL 4 as
the guest OS
and SA 3.1.7.

I just did the same thing last week, except we're using RHEL 3 and ESX
2.5.2, and the physical box it used to be on was far less powerful than
yours.

Each user has their own Bayes files (Berkeley DB) and these were copied
from the old to
the new server.  Now whenever an expiry process runs on a user's
database, the CPU
spikes, sometimes for a minute or longer.

Hmm.  We're using ours as a site-wide MTA to be able to reject incoming
mails at SMTP time, so no user DBs on the box, but we are running with
Bayes checking on (Berkeley DB), autolearning off, and manual Bayes
feeding only a few times a day.  Because of that, I don't have practice
with a heavy Bayes load, but how certain are you that it's Bayes hitting
the CPU; did you run sa-learn (or spamassassin) with network reporting
turned off to see if that makes a difference?

I ask because pyzor did keep our CPU at a constant 75% until I turned it
off; now it varies from 25% to 75% over the day, which is a lot more
acceptable :)

Another thought, albeit perhaps not directly related, is are you running
spamd with --round-robin?  When I did that, it reduced the CPU load with
the trade-off of using a little more memory, which seems to be the
better trade-off, especially for a VM on ESX.

-- 
John C. Ring, Jr. 
[EMAIL PROTECTED] 
Network Engineer
Union Switch & Signal Inc.

If men were angels, no government would be necessary. If angels were to
govern men, neither external nor internal controls on government would
be necessary. -- James Madison


Re: ImageInfo vs FuzzyOCR performance?

2006-10-27 Thread Kenneth Porter
--On Friday, October 27, 2006 6:29 AM -0700 Jeff Chan [EMAIL PROTECTED] 
wrote:



Does anyone have any recent feedback about the performance of
ImageInfo versus FuzzyOCR about detecting stock image spams (or
any others)?  Does FuzzyOCR catch significantly more spams than
ImageInfo?


The last I checked, ImageInfo simply reads some header info from the image. 
It's pretty lightweight, probably more so than any Perl-based regex in SA. 
FuzzyOCR is much more compute-intensive, since it has to perform image 
processing (through gocr, as well as conversions necessary to get the input 
into the format that gocr expects).





Re: [OT] Filter Server Specs

2006-10-27 Thread Clifton Royston
On Fri, Oct 27, 2006 at 02:42:49PM +, Duane Hill wrote:
 Currently, we are looking to install a server that will be doing content 
 filtering for our main e-mail server. I thought I would toss this out to 
 everyone to get some feedback on if the server would be adequate.
 
 The server is a Dell PowerEdge 6850 with the following:
 
  - Four 2.6 GHz/800Mhz/4mb Cache Dual-Core Intel Xeon 7110M processors
  - Eight GB DDR2 400Mhz ram
  - Four 300GB, 3Gbps, SAS, 10K RPM Hard Drives running Raid-5 on a 
 PERC5/i controller
 
 Our main e-mail server services over 500 domains with an account total 
 of around 40,000.
 
 The current filter server we have can not do any content filtering 
 outside of itself (i.e. the MTA) because of CPU load (i.e. 
 SpamAssassin). Any message scanning where the message size is over 1.5K 
 will kill the CPU. The current filter server we have in place is 
 rejecting an average 2.4 million per day with just the common 
 blacklisting and some other things that are set in place.
 
  I *think* this should handle your load.  Personally from my years of
ISP experience, I'd strongly favor going the road of multiple identical
servers in parallel rather than putting all your eggs in one basket. 
E.g. use two 4 CPU servers rather than one 8 CPU (4x dualcore) server.
The difference is that if it comes up just short, or if load jumps up
again, it's easier to add a 3rd server and cut it into the mail path
than to upgrade a server which is handling all your filtering.

  You also don't need fast hard drives on a filtering server; it's
almost all gonna be pushing the CPU and RAM.

 The other thing I would like to know is what kind of an operating system 
 would one install on this new server?

  This'll get you into a religious war for sure...  I would favor
FreeBSD latest (6.x), but any version of Linux with a good package
system and a recent 2.6 kernel is a good choice - maybe better than
FreeBSD at using 8 CPUs.  Reasonable possibilities include CentOS,
Gentoo, Debian.  I'm not a big Linux head, others may have stronger
opinions on that front.

  -- Clifton

-- 
Clifton Royston  --  [EMAIL PROTECTED] / [EMAIL PROTECTED]
   President  - I and I Computing * http://www.iandicomputing.com/
 Custom programming, network design, systems and network consulting services


RE: High CPU running SA in a VMware VM

2006-10-27 Thread Sammy Anderson
I'm pretty sure it is that, because when I turn off Bayes altogether, the
spikes go away. I also ran sa-learn --force-expire and it PEGS the VM.
With Bayes debugging enabled, I see lines like this in my syslog:

bayes: expired old bayes database entries in 236 seconds: 152268 entries kept, 9457 deleted

We have about 140 users, each with a 5 MB bayes_toks file, so there is a
need to expire somebody all throughout the day. Each user is virtual;
they don't really have an account on the box, but the directories
correspond to each user address. And we do auto-learn, with
opportunistic expiry.

Good thought about --round-robin, I am willing to use a little more
memory if it saves on CPU.

"Ring, John C" [EMAIL PROTECTED] wrote:

From: Sammy Anderson [mailto:[EMAIL PROTECTED]

We recently migrated our SpamAssassin installation from a physical 3.6
GHz system running RHEL 4 and SA 3.0.4 to a VMware VM (ESX 2.5.4) with
RHEL 4 as the guest OS and SA 3.1.7.

I just did the same thing last week, except we're using RHEL 3 and ESX
2.5.2, and the physical box it used to be on was far less powerful than
yours.

Each user has their own Bayes files (Berkeley DB) and these were copied
from the old to the new server.  Now whenever an expiry process runs on
a user's database, the CPU spikes, sometimes for a minute or longer.

Hmm.  We're using ours as a site-wide MTA to be able to reject incoming
mails at SMTP time, so no user DBs on the box, but we are running with
Bayes checking on (Berkeley DB), autolearning off, and manual Bayes
feeding only a few times a day.  Because of that, I don't have practice
with a heavy Bayes load, but how certain are you that it's Bayes hitting
the CPU; did you run sa-learn (or spamassassin) with network reporting
turned off to see if that makes a difference?

I ask because pyzor did keep our CPU at a constant 75% until I turned it
off; now it varies from 25% to 75% over the day, which is a lot more
acceptable :)

Another thought, albeit perhaps not directly related, is are you running
spamd with --round-robin?  When I did that, it reduced the CPU load with
the trade-off of using a little more memory, which seems to be the
better trade-off, especially for a VM on ESX.

-- 
John C. Ring, Jr.
[EMAIL PROTECTED]
Network Engineer
Union Switch & Signal Inc.

"If men were angels, no government would be necessary. If angels were to
govern men, neither external nor internal controls on government would
be necessary." -- James Madison

Re: High CPU running SA in a VMware VM

2006-10-27 Thread Anders Norrbring
Sorry about top-posting, but I just caught the topic, and found it a 
bit interesting...


I run my SMTP server entirely in a VMware VM, and have *never* seen a 
high CPU usage on that particular machine.  I run Postfix, Amavis-new 
2.4.3, SA 3.1.7 and quite some plug-ins.


Bayes and quarantine are all in a MySQL database stored on another VM, 
no big load there either...
At peaks, I have a 2-4% CPU usage and 20-65% memory usage on each VM, 
all reported by Virtual Center 1.4.


So, naturally I'm curious about why there would be a high CPU load from 
using SA. My guess is that it's something else causing it.


--

Anders Norrbring
Norrbring Consulting

Sammy Anderson wrote:
I'm pretty sure it is that, because when I turn off Bayes altogether, the 
spikes go away.  I also ran sa-learn --force-expire and it PEGS the VM.  
With bayes debugging enabled, I see lines like this in my syslog:


bayes: expired old bayes database entries in 236 seconds: 152268 entries 
kept, 9457 deleted


We have about 140 users, each with a 5 MB bayes_toks file, so there is a 
need to expire somebody all throughout the day.  Each user is virtual, 
they don't really have an account on the box, but the directories 
correspond to each user address.  And we do auto-learn, with 
opportunistic expiry.


Good thought about --round-robin, I am willing to use a little more 
memory if it saves on CPU.


*/Ring, John C [EMAIL PROTECTED]/* wrote:

 From: Sammy Anderson [mailto:[EMAIL PROTECTED]
 
 We recently migrated our SpamAssassin installation from a physical 3.6
GHz system
 running RHEL 4 and SA 3.0.4 to a VMware VM (ESX 2.5.4) with RHEL 4 as
the guest OS
 and SA 3.1.7.

I just did the same thing last week, except we're using RHEL 3 and ESX
2.5.2, and the physical box it used to be on was far less powerful than
yours.

 Each user has their own Bayes files (Berkeley DB) and these were
copied
from the old to
 the new server. Now whenever an expiry process runs on a user's
database, the CPU
 spikes, sometimes for a minute or longer.

Hmm. We're using ours as a site-wide MTA to be able to reject incoming
mails at SMTP time, so no user DBs on the box, but we are running with
Bayes checking on (Berkeley DB), autolearning off, and manual Bayes
feeding only a few times a day. Because of that, I don't have practice
with a heavy Bayes load, but how certain are you that it's Bayes hitting
the CPU; did you run sa-learn (or spamassassin) with network reporting
turned off to see if that makes a difference?

I ask because pyzor did keep our CPU at a constant 75% until I turned it
off; now it varies from 25% to 75% over the day, which is a lot more
acceptable :)

Another thought, albeit perhaps not directly related, is are you running
spamd with --round-robin? When I did that, it reduced the CPU load with
the trade-off of using a little more memory, which seems to be the
better trade-off, especially for a VM on ESX.

-- 
John C. Ring, Jr.

[EMAIL PROTECTED]
Network Engineer
Union Switch & Signal Inc.

If men were angels, no government would be necessary. If angels were to
govern men, neither external nor internal controls on government would
be necessary. -- James Madison








URIXBL?

2006-10-27 Thread Jeff Hardy
Hello all,

I've been diddling with some tests and wondered why there is a spamhaus
URIBL_SBL, but not URIBL_XBL (or better yet, combined URIBL_SBL-XBL).  I
can create this myself easy enough, but wondered if there was a reason
XBL is not included.  Thanks.

-Jeff



Re: mcafee-spamassassin-rules

2006-10-27 Thread Theo Van Dinter
On Fri, Oct 27, 2006 at 01:38:32PM -0400, Chris Santerre wrote:
  It's also worth noting that hypothetically, if I was a 
  company releasing
  updates based on an open-source product, I may have incentive to avoid
  making those updates useful on said product, otherwise people would
  download my updates and not pay me for the software.
 
 Wouldn't that be against the open source lic? 

Not that I'm aware of, why would it be?  If I produce something on my
own (like new rules) and publish it, I'm not bound by someone else's
licensing.  In this case, if I'm following the code license and make
modifications such that new rules that I produce are in a proprietary
format, then that's perfectly valid.  With SA 3, I could even make the
config parsing a plugin and not have to modify any of the base code.

-- 
Randomly Selected Tagline:
I came here to kick butt and chew gum, and I'm all out of gum.
  - They Live (movie)




Re: URIXBL?

2006-10-27 Thread Justin Mason

Jeff Hardy writes:
 Hello all,
 
 I've been diddling with some tests and wondered why there is a spamhaus
 URIBL_SBL, but not URIBL_XBL (or better yet, combined URIBL_SBL-XBL).  I
 can create this myself easy enough, but wondered if there was a reason
 XBL is not included.  Thanks.

Basically, it didn't work well ;)  Try it out -- it doesn't correlate
well with spam.
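
For anyone who wants to try it, here is a hedged sketch of what such a test
rule might look like, following the uridnsbl syntax of the URIDNSBL plugin.
The rule name URIBL_XBL_TEST, the description text, the tiny score and the
helper function are all made up for this example; it is not a rule that
ships with SpamAssassin.

```shell
#!/bin/sh
# Write an experimental XBL URI rule to the file given as $1.
# The rule name URIBL_XBL_TEST and the 0.01 score are assumptions,
# chosen so the rule shows up in reports without affecting verdicts.
write_xbl_rule() {
    cat > "$1" <<'EOF'
uridnsbl  URIBL_XBL_TEST  xbl.spamhaus.org.  TXT
body      URIBL_XBL_TEST  eval:check_uridnsbl('URIBL_XBL_TEST')
describe  URIBL_XBL_TEST  Contains a URL listed in the XBL blocklist (test)
tflags    URIBL_XBL_TEST  net
score     URIBL_XBL_TEST  0.01
EOF
}
```

Write the file into a scratch config directory, run "spamassassin --lint"
against it, and then compare hit rates on your own spam and ham before
trusting it with a real score.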

--j.


Re: URIXBL?

2006-10-27 Thread Stuart Johnston

Jeff Hardy wrote:

Hello all,

I've been diddling with some tests and wondered why there is a spamhaus
URIBL_SBL, but not URIBL_XBL (or better yet, combined URIBL_SBL-XBL).  I
can create this myself easy enough, but wondered if there was a reason
XBL is not included.  Thanks.


XBL is mostly infected PCs.  These systems are used to send spam but not 
generally to host spam domains.


Re: URIXBL?

2006-10-27 Thread Jeff Hardy
On Fri, 2006-10-27 at 20:38 +0100, Justin Mason wrote:
 Jeff Hardy writes:
  Hello all,
  
  I've been diddling with some tests and wondered why there is a spamhaus
  URIBL_SBL, but not URIBL_XBL (or better yet, combined URIBL_SBL-XBL).  I
  can create this myself easy enough, but wondered if there was a reason
  XBL is not included.  Thanks.
 
 Basically, it didn't work well ;)  Try it out -- it doesn't correlate
 well with spam.
 
 --j.

Fair enough I'll test away.  BTW, for anyone else coming across this
post:

 warn: config: error: rule 'URIBL_SBL-XBL' has invalid characters (not
Alphanumeric + Underscore + starting with a non-digit)

Have to get rid of that hyphen.  Thank you 'spamassassin -D all ...'  :)
Thanks for the reply.
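
A small sketch of the naming rule that warning describes (letters, digits
and underscores only, not starting with a digit). This only approximates
the check from the wording of the message; it is not SpamAssassin's own
code, and valid_rule_name is a made-up helper.

```shell
#!/bin/sh
# Returns 0 if the name uses only letters, digits and underscores
# and does not start with a digit; returns 1 otherwise.
valid_rule_name() {
    printf '%s\n' "$1" | grep -Eq '^[A-Za-z_][A-Za-z0-9_]*$'
}

# Report each candidate name as acceptable or not.
for name in URIBL_SBL URIBL_SBL-XBL URIBL_SBL_XBL; do
    if valid_rule_name "$name"; then
        echo "ok:      $name"
    else
        echo "invalid: $name"
    fi
done
```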

-Jeff



Re: MailScanner versus Amavisd-new with postfix

2006-10-27 Thread Martin Hepworth

Jeff Chan wrote:

Not to start any flamewars, but does anyone have strong opinions
on MailScanner versus Amavisd-new for use with postfix (and of
course SpamAssassin and ClamAV)?

In the old days it seemed Amavisd-new may have integrated better
with postfix, but is that no longer the case?  Some folks say
MailScanner is faster and leaner.

What gives?

Jeff C.

Jeff

Can't say I've compared the two, but I run MailScanner and it has gained 
a couple of neat features recently: its own MD5 cache of recent spam, 
which speeds things up a lot, and the built-in phishing testing (yeah, OK, 
this has been in a while).


It also glues together SA, 12 anti-virus engines, and its own tests (like 
executables, which has saved me a few times before the AV people had 
updates).


horses for courses, but it's nice to have a choice of amavis-new OR 
MailScanner.


--
Martin Hepworth
Senior Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300




RE: MailScanner versus Amavisd-new with postfix

2006-10-27 Thread Dan Horne
 

 -Original Message-
 From: Jeff Chan [mailto:[EMAIL PROTECTED] 
 Sent: Friday, October 27, 2006 9:54 AM
 To: SpamAssassin Users
 Subject: MailScanner versus Amavisd-new with postfix
 
 Not to start any flamewars, but does anyone have strong 
 opinions on MailScanner versus Amavisd-new for use with 
 postfix (and of course SpamAssassin and ClamAV)?
 
 In the old days it seemed Amavisd-new may have integrated 
 better with postfix, but is that no longer the case?  Some 
 folks say MailScanner is faster and leaner.
 
 What gives?
 
 Jeff C.
 --
 Jeff Chan
 mailto:[EMAIL PROTECTED]
 http://www.surbl.org/
 
 

Wietse Venema says that MailScanner uses unsupported methods to
manipulate the queue that could lead (and has led) to lost email.  I don't
know the full details, but it has been discussed much on the postfix
list.  My impression is that the condition is rare, but it does happen.

Just a heads up.

-DH





RE: MailScanner versus Amavisd-new with postfix

2006-10-27 Thread Kurt Buff
note: I don't use mailscanner, so am only relaying what I saw on the postfix
list.

My understanding (based on foggy memory - search the list archives for a
better answer) is that MailScanner dipped into postfix queues using either
undocumented postfix APIs or by bypassing postfix entirely and directly
manipulating files on disk. This led to instances of documented mail loss.
Wietse therefore said that it wasn't safe to use.

I've also recently read (I believe also on the postfix list, but am not
sure) that MailScanner has remedied this behavior, and that it is now safe
to use with postfix, but you'll need to confirm for yourself if that is
true.

Kurt

| -Original Message-
| From: Jeff Chan [mailto:[EMAIL PROTECTED]
| Sent: Friday, October 27, 2006 06:54
| To: SpamAssassin Users
| Subject: MailScanner versus Amavisd-new with postfix
| 
| 
| Not to start any flamewars, but does anyone have strong opinions
| on MailScanner versus Amavisd-new for use with postfix (and of
| course SpamAssassin and ClamAV)?
| 
| In the old days it seemed Amavisd-new may have integrated better
| with postfix, but is that no longer the case?  Some folks say
| MailScanner is faster and leaner.
| 
| What gives?
| 
| Jeff C.
| -- 
| Jeff Chan
| mailto:[EMAIL PROTECTED]
| http://www.surbl.org/
| 


  



RE: High CPU running SA in a VMware VM

2006-10-27 Thread Mark


 -Original Message-
 From: Anders Norrbring [mailto:[EMAIL PROTECTED]
 Sent: Friday, 27 October 2006 20:58
 To: users@spamassassin.apache.org
 Subject: Re: High CPU running SA in a VMware VM


 I run my SMTP server entirely in a VMware VM, and have *never* seen a
 high CPU usage on that particular machine. I run Postfix, Amavis-new
 2.4.3, SA 3.1.7 and quite some plug-ins.

 Bayes and quarantine are all in a MySQL database stored on
 another VM, no big load there either...

I concur. I've been using VMware for years, as a shadow/test server for the
production FreeBSD one, and never had any such issue.
VMware rocks! :)

I would run any of the db_dump or db_upgrade utils for BerkeleyDB; or
reinstall DB_File (and make darn sure it's compiled against the correct
BerkeleyDB libs). At any rate, I myself would probably be more inclined to
look into a BerkeleyDB issue than a VMware one.

- Mark



Re: ImageInfo vs FuzzyOCR performance?

2006-10-27 Thread Jorge Valdes

Jeff Chan wrote:

Does anyone have any recent feedback about the performance of
ImageInfo versus FuzzyOCR about detecting stock image spams (or
any others)?  Does FuzzyOCR catch significantly more spams than
ImageInfo?

Cheers,

Jeff C.
  
I may be biased, as I help with FuzzyOcr development, but I do use both.
ImageInfo is fine and will get you part of the way there, but FuzzyOcr 
hits more often. Scanning ~8K msgs/day, FuzzyOcr hits ~1600 times 
and ImageInfo hits  150 times on average. On my system, here are the 
top 10 rule hits from yesterday:


SPAM Results:
  3936 Message(s)   49.83%
19.399 Average Score

  3343 Time(s)    7.50%   84.93%   Hit Rule: BAYES_99
  3068 Time(s)    6.88%   77.95%   Hit Rule: HTML_MESSAGE
  1655 Time(s)    3.71%   42.05%   Hit Rule: FUZZY_OCR
  1527 Time(s)    3.42%   38.80%   Hit Rule: SARE_GIF_ATTACH
  1411 Time(s)    3.16%   35.85%   Hit Rule: URIBL_BLACK
  1274 Time(s)    2.86%   32.37%   Hit Rule: URIBL_BLACK_OVERLAP
  1271 Time(s)    2.85%   32.29%   Hit Rule: MIME_HTML_ONLY
  1215 Time(s)    2.72%   30.87%   Hit Rule: URIBL_JP_SURBL
  1187 Time(s)    2.66%   30.16%   Hit Rule: RCVD_IN_BL_SPAMCOP_NET
  1184 Time(s)    2.66%   30.08%   Hit Rule: SARE_GIF_STOX
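
The two percentage columns in the report are not labeled. The second one
appears to be each rule's hit count as a share of the 3936 spam messages
(for example, 3343 / 3936 is about 84.93%). Here is a minimal Python sketch
of that arithmetic, using a few counts copied from the table. The function
name share_of_spam is made up for illustration, and this is not the tool
that produced the report.

```python
# Hit counts copied from the report; SPAM_TOTAL is its "Message(s)" line.
SPAM_TOTAL = 3936
RULE_HITS = {
    "BAYES_99": 3343,
    "HTML_MESSAGE": 3068,
    "FUZZY_OCR": 1655,
    "SARE_GIF_STOX": 1184,
}


def share_of_spam(hits, total=SPAM_TOTAL):
    """Return hits as a percentage of spam messages, rounded to 2 places."""
    return round(100.0 * hits / total, 2)


if __name__ == "__main__":
    for rule, hits in RULE_HITS.items():
        print(f"{hits:6d}  {share_of_spam(hits):6.2f}%  {rule}")
```

The first percentage column is probably a share of some larger total (such
as all rule hits), but the report does not say, so it is left out here.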


Jorge Valdes




Re: domainkeys unverified - solved

2006-10-27 Thread Chris Purves

Chris Purves wrote:
I just got the domainkeys plugin set up, but it's not working the way I 
expect.


In messages from Yahoo I see:

0.0 DK_SIGNED Domain Keys: message has an unverified signature

but I never see DK_VERIFIED

Is there something I need to configure?  I didn't apply the patch, 
because I'm assuming it's been incorporated into 3.1.4.




In the end, with the help of Mark Martinec, I was able to determine that 
the problem was with my ISP-provided DNS nameservers not allowing full 
TXT records to be returned (they were truncated).


I installed bind9 and used localhost as my primary nameserver and now I 
can get DK_VERIFIED.



Symptoms for this problem were:

DK_VERIFIED does not fire for Yahoo! e-mails (multiple part TXT record)
DK_VERIFIED does fire for Gmail e-mail (single part TXT record)
Perl modules Mail::DomainKeys and Mail::DKIM will fail during make test



--
Chris



Re: Scoring base64 blob messages

2006-10-27 Thread Peter H. Lemieux

Theo Van Dinter wrote:

On Thu, Oct 26, 2006 at 12:19:23PM -0400, Peter H. Lemieux wrote:

No, because there are going to be a lot of mails that would hit that.
Really?  Maybe it's because I live in the US, but I can't think of a 
legitimate message I've ever received consisting only of a base64 blob. 


You look at a lot of raw messages?  ;)


Doesn't everybody?

Seriously, I do look at a lot of raw messages; for instance, I review the 
full text of nearly every spam message that doesn't get caught by my 
filters and shows up in my inbox.  Obviously I don't get much mail from 
Blackberry users or Ticketmaster!


Rather than making anyone else do the work for me, is there something I 
can read about how to determine the frequency of different message 
features appearing in the corpus?



Well, there isn't a SA corpus, so there's no answer to that question.


Ah, I hadn't read this page before:
http://wiki.apache.org/spamassassin/HandClassifiedCorpora
My recollection was that 2.x used a centrally-defined corpus rather than 
a variety of developers' corpora (see, I read the wiki).  Either things 
changed with the switch in scoring algorithms in 3.x, or my recollection 
is shoddy.  Probably the latter.



You can generate some rules and use mass-check to run against your own corpus
to gather some statistics.  I'm willing to run some rules for you against my
corpus if you want.  I just don't have time to come up with the rules right
now.


Thanks for the offer, Theo, but don't spend your valuable time on this. 
I'll give it a shot some day when I've got some spare moments.  If I do get 
some candidate rules, I'll pass them along to you for testing.



Thanks again!
Peter


Re: Scoring base64 blob messages

2006-10-27 Thread Theo Van Dinter
On Fri, Oct 27, 2006 at 05:24:58PM -0400, Peter H. Lemieux wrote:
 Well, there isn't a SA corpus, so there's no answer to that question.
 
 Ah, I hadn't read this page before:
   http://wiki.apache.org/spamassassin/HandClassifiedCorpora
 My recollection was that 2.x used a centrally-defined corpus rather than 
 a variety of developers' corpora (see, I read the wiki).  Either things 
 changed with the switch in scoring algorithms in 3.x, or my recollection 
 is shoddy.  Probably the latter.

Yeah, sorry.  We've had separate corpora since I started with SA several years
ago.  There was a public corpus of mail made available which could be
confusing your memory. :)

-- 
Randomly Selected Tagline:
I pity the shul that won't let Krusty in now. Spin me clown!
 - Mr. T, The Simpsons, Today, I Am a Klown




Re: domainkeys unverified - solved

2006-10-27 Thread Peter H. Lemieux

Chris Purves wrote:
In the end, with the help of Mark Martinec, I was able to determine that 
the problem was with my ISP-provided DNS nameservers not allowing full 
TXT records to be returned (they were truncated).


Was this something that the ISP cooked up, or was it intrinsic to the DNS 
server software they are using?  If the latter, it would be good to know 
which server they were running.  It might be a useful addition to the 
FAQ/wiki.


Peter



Re: High CPU running SA in a VMware VM

2006-10-27 Thread Theo Van Dinter
On Fri, Oct 27, 2006 at 09:10:28PM +, Mark wrote:
  I run my SMTP server entirely in a VMware VM, and have *never* seen a
  high CPU usage on that particular machine. I run Postfix, Amavis-new
  2.4.3, SA 3.1.7 and quite some plug-ins.
 
 I would run any of the db_dump or db_upgrade utils for BerkeleyDB; or
 reinstall DB_File (and make darn sure it's compiled against the correct
 BerkeleyDB libs). At any rate, I myself would probably be more inclined to
 look into a BerkeleyDB issue than a Vmware one.

Yeah, I doubt there's an issue with VMware specifically (ESX++).  My guess is
that if you're seeing different behavior between a physical host and virtual
host, there's something different in the virtual host -- different OS, libs,
perl modules, etc.

Obviously that won't be the case if you virtualized a physical machine, but I
seem to recall from the start of the thread that you migrated the data but not
the OS.

-- 
Randomly Selected Tagline:
My wife and I were happy for years.  Then we met.




Re: High CPU running SA in a VMware VM

2006-10-27 Thread Sammy Anderson
You are correct, this was a new build, with a later version of SA and
migrated Bayes files. It could very well be the case that Berkeley DB
needs to be patched, or the data converted in some fashion.

I will say that in a VM environment, we tried to build gcc, and it took
MUCH longer than on a physical box with the same processors.  VMware
analyzed our data, and they determined that we should disable NPTL and
use LinuxThreads instead (kb 1470). This did help substantially, and
though slower than the physical machine, it was acceptable. I have tried
this for SA, and it does seem to cut down the CPU required, so there is
some hope.

Theo Van Dinter [EMAIL PROTECTED] wrote:
 On Fri, Oct 27, 2006 at 09:10:28PM +, Mark wrote:
   I run my SMTP server entirely in a VMware VM, and have *never* seen a
   high CPU usage on that particular machine. I run Postfix, Amavis-new
   2.4.3, SA 3.1.7 and quite some plug-ins.

  I would run any of the "db_dump" or "db_upgrade" utils for BerkeleyDB; or
  reinstall DB_File (and make darn sure it's compiled against the correct
  BerkeleyDB libs). At any rate, I myself would probably be more inclined to
  look into a BerkeleyDB issue than a Vmware one.

 Yeah, I doubt there's an issue with VMware specifically (ESX++).  My guess is
 that if you're seeing different behavior between a physical host and virtual
 host, there's something different in the virtual host -- different OS, libs,
 perl modules, etc.

 Obviously that won't be the case if you virtualized a physical machine, but I
 seem to recall from the start of the thread that you migrated the data but not
 the OS.

 -- 
 Randomly Selected Tagline:
 My wife and I were happy for years.  Then we met.

Re: domainkeys unverified - solved

2006-10-27 Thread Justin Mason

Peter H. Lemieux writes:
 Chris Purves wrote:
  In the end, with the help of Mark Martinec, I was able to determine that 
  the problem was with my ISP-provided DNS nameservers not allowing full 
  TXT records to be returned (they were truncated).
 
 Was this something that the ISP cooked up, or was it intrinsic to the DNS 
 server software they are using?  If the latter, it would be good to know 
 which server they were running.  It might be a useful addition to the 
 FAQ/wiki.

yes, definitely -- this is worth knowing about...

--j.


Re: High CPU running SA in a VMware VM

2006-10-27 Thread Sammy Anderson
I manually ran sa-learn --force-expire, and it hammered the box.  Here is
the debug and timing information (for just a 5 MB file!):

  [18002] dbg: bayes: tie-ing to DB file R/O /home/ian/.spamassassin/bayes_toks
  [18002] dbg: bayes: tie-ing to DB file R/O /home/ian/.spamassassin/bayes_seen
  [18002] dbg: bayes: found bayes db version 3
  [18002] dbg: bayes: DB journal sync: last sync: 1161899721
  [18002] dbg: bayes: opportunistic call found journal sync due
  [18002] dbg: bayes: bayes journal sync starting
  [18002] dbg: bayes: tie-ing to DB file R/W /home/ian/.spamassassin/bayes_toks
  [18002] dbg: bayes: tie-ing to DB file R/W /home/ian/.spamassassin/bayes_seen
  [18002] dbg: bayes: found bayes db version 3
  [18002] dbg: bayes: synced databases from journal in 0 seconds: 792 unique entries (974 total entries)
  [18002] dbg: bayes: bayes journal sync completed
  [18002] dbg: bayes: bayes journal sync starting
  [18002] dbg: bayes: bayes journal sync completed
  [18002] dbg: bayes: expiry starting
  [18002] dbg: bayes: expiry check keep size, 0.75 * max: 112500
  [18002] dbg: bayes: token count: 161725, final goal reduction size: 49225
  [18002] dbg: bayes: first pass? current: 1161986180, Last: 1161862273, atime: 691200, count: 10015, newdelta: 140627, ratio: 4.91512730903645, period: 43200
  [18002] dbg: bayes: can't use estimation method for expiry, unexpected result, calculating optimal atime delta (first pass)
  [18002] dbg: bayes: expiry max exponent: 9
  -- about 20 seconds elapsed
  [18002] dbg: bayes: atime token reduction
  [18002] dbg: bayes:  ===
  [18002] dbg: bayes: 43200 144256
  [18002] dbg: bayes: 86400 133029
  [18002] dbg: bayes: 172800 111350
  [18002] dbg: bayes: 345600 72306
  [18002] dbg: bayes: 691200 9457
  [18002] dbg: bayes: 1382400 0
  [18002] dbg: bayes: 2764800 0
  [18002] dbg: bayes: 5529600 0
  [18002] dbg: bayes: 11059200 0
  [18002] dbg: bayes: 22118400 0
  [18002] dbg: bayes: first pass decided on 691200 for atime delta
  -- about 40 seconds elapsed [a sort going on here???]
  [18002] dbg: bayes: untie-ing
  [18002] dbg: bayes: untie-ing db_toks
  [18002] dbg: bayes: untie-ing db_seen
  [18002] dbg: bayes: files locked, now unlocking lock
  expired old bayes database entries in 60 seconds = YIKES
  152268 entries kept, 9457 deleted
  token frequency: 1-occurrence tokens: 68.79%
  token frequency: less than 8 occurrences: 18.63%
  [18002] dbg: bayes: expiry completed
  .
  real 1m6.157s
  user 0m56.044s = WOW!
  sys 0m2.370s

Anders Norrbring [EMAIL PROTECTED] wrote:
 Sorry about top-posting, but I just caught the topic, and found it a bit
 interesting...

 I run my SMTP server entirely in a VMware VM, and have *never* seen a
 high CPU usage on that particular machine.  I run Postfix, Amavis-new
 2.4.3, SA 3.1.7 and quite some plug-ins.

 Bayes and quarantine are all in a MySQL database stored on another VM,
 no big load there either...

 At peaks, I have a 2-4% CPU usage and 20-65% memory usage on each VM,
 all reported by Virtual Center 1.4.

 So, naturally I'm curious about why there would be a high CPU load from
 using SA. My guess is that it's something else causing it.

 -- 
 Anders Norrbring
 Norrbring Consulting

 Sammy Anderson wrote:
  I'm pretty sure it is that, because when I turn off bayes altogether, the
  spikes go away.  I also ran sa-learn --force-expire and it PEGS the VM.
  With bayes debugging enabled, I see lines like this in my syslog:

  bayes: expired old bayes database entries in 236 seconds: 152268 entries
  kept, 9457 deleted

  We have about 140 users, each with a 5 MB bayes_toks file, so there is a
  need to expire somebody all throughout the day.  Each user is virtual,
  they don't really have an account on the box, but the directories
  correspond to each user address.  And we do auto-learn, with
  opportunistic expiry.

  Good thought about --round-robin, I am willing to use a little more
  memory if it saves on CPU.

  "Ring, John C" wrote:
   From: Sammy Anderson [mailto:[EMAIL PROTECTED]
    We recently migrated our SpamAssassin installation from a physical
    3.6 GHz system running RHEL 4 and SA 3.0.4 to a VMware VM (ESX 2.5.4)
    with RHEL 4 as the guest OS and SA 3.1.7.

   I just did the same thing last week, except we're using RHEL 3 and ESX
   2.5.2, and the physical box it used to be on was far less powerful than
   yours.

    Each user has their own Bayes files (Berkeley DB) and these were copied
    from the old to the new server. Now whenever an expiry process runs on
    a user's database, the CPU spikes, sometimes for a minute or longer.

   Hmm. We're using ours as a site-wide MTA to be able to reject incoming
   mails at SMTP time, so no user DBs on the box, but we are running with
   Bayes checking on (Berkeley DB), autolearning off, and manual Bayes
   feeding only a few times a day. Because of that, I don't have practice
   with a heavy Bayes load, but how certain are you that it's Bayes hitting
   the CPU; did you 

RE: domainkeys unverified - solved

2006-10-27 Thread Mark


 -Original Message-
 From: Chris Purves [mailto:[EMAIL PROTECTED]
 Sent: Friday, 27 October 2006 23:20
 To: users@spamassassin.apache.org
 Subject: Re: domainkeys unverified - solved


 In the end, with the help of Mark Martinec, I was able to
 determine that the problem was with my ISP-provided DNS
 nameservers not allowing full TXT records to be returned
 (they were truncated).

 Symptoms for this problem were:

 DK_VERIFIED does not fire for Yahoo! e-mails (multiple part
 TXT record)

Interesting.

nslookup -q=txt lima._domainkey.yahoogroups.com

k=rsa;
p=MHwwDQYJKoZIhvcNAQEBBQADawAwaAJhAL10WHRWMSb9Tnl+k4Kzpc18rDCTpDT1pbK0xwkd
ZIZkaP8NB75qa/S57xccZlIwbI22Ooy/IY+8WxQtvE2z4W
LLNOf9hkMeicUH48TGkEoCAcaSjJz/b3NMrOy9l1U7gQIDAP//

I get two parts, too. Is that their correct public key, when concatenated?
Though I do not get both parts in random order, I wonder if I would not
have the same issue, then.

- Mark



Re: High CPU running SA in a VMware VM

2006-10-27 Thread Theo Van Dinter
On Fri, Oct 27, 2006 at 03:01:45PM -0700, Sammy Anderson wrote:
 I manually ran sa-learn --force-expire, and it hammered the box.   Here is a 
 debug and timing information (for just a 5 MB file!):
   
   [18002] dbg: bayes: token count: 161725, final goal reduction size: 49225

want to get rid of (max) 49225 tokens

   [18002] dbg: bayes: can't use estimation method for expiry, unexpected 
 result, calculating optimal atime delta (first pass)

have to do step 1 and can't estimate

   [18002] dbg: bayes: expiry max exponent: 9
   -- about 20 seconds elapsed

it's going through every token in your db

   [18002] dbg: bayes: atime token reduction
   [18002] dbg: bayes:  ===
   [18002] dbg: bayes: 43200 144256
   [18002] dbg: bayes: 86400 133029
   [18002] dbg: bayes: 172800 111350
   [18002] dbg: bayes: 345600 72306
   [18002] dbg: bayes: 691200 9457
   [18002] dbg: bayes: 1382400 0
[...]
   [18002] dbg: bayes: first pass decided on 691200 for atime delta

691200 wins the Price Is Right (9457 is the closest without going over)

   -- about 40 seconds elapsed [a sort going on here???]

It's creating a new DB file, going back through every token in the original
DB, and for any that are newer than 691200 seconds ago (the chosen atime
delta), it copies the entry to the new DB.  The other 9457 tokens are dropped.

   expired old bayes database entries in 60 seconds = YIKES

yep.  expiry is relatively resource intensive and slow w/ DBMs, but
there's no other good way to do it (or at least, no one has suggested
a really better way to do it...)
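
The first-pass choice walked through above can be sketched in a few lines
of Python. This is a simplified illustration built only from the numbers in
the debug log, not SpamAssassin's actual (Perl) code. The function names are
made up, and the maximum DB size of 150000 tokens is inferred from the
"0.75 * max: 112500" line.

```python
# (atime delta in seconds, tokens that would be expired) from the debug log.
REDUCTION_TABLE = [
    (43200, 144256),
    (86400, 133029),
    (172800, 111350),
    (345600, 72306),
    (691200, 9457),
    (1382400, 0),
]


def goal_reduction(token_count, max_db_size, keep_fraction=0.75):
    """Tokens to drop so the DB shrinks to keep_fraction * max_db_size."""
    return token_count - int(keep_fraction * max_db_size)


def pick_atime_delta(table, goal):
    """Pick the delta that expires the most tokens without exceeding goal."""
    candidates = [(delta, n) for delta, n in table if 0 < n <= goal]
    if not candidates:
        return None  # nothing can be expired within the goal
    return max(candidates, key=lambda pair: pair[1])


if __name__ == "__main__":
    goal = goal_reduction(161725, 150000)
    print(goal, pick_atime_delta(REDUCTION_TABLE, goal))
```

With the logged values the goal comes out at 49225 tokens, and 691200 is
the only delta whose reduction (9457) is non-zero and does not go over it.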

-- 
Randomly Selected Tagline:
I believe it's not butter, I just can't believe it's $1.59!




Re: Rules to reject bounce messages for mail not sent by me

2006-10-27 Thread Jo Rhett

On Oct 27, 2006, at 3:58 AM, Justin Mason wrote:

Nick Gilbert writes:

PS. Will setting up SPF on my domain name have any effect for things
like this? Will it discourage spammers from using my domain or reduce
the number of bounce messages I/we get?


nope.  they don't bother checking, and the systems sending bounces
aren't the ones that are being kept up-to-date enough to check SPF
either.


Umm... not in my experience.  Every time we turn on SPF for a domain,  
the amount of backscatter goes to about a third of the previous  
amount.  Every time I've been involved anyway.


--
Jo Rhett
Senior Network Engineer
Network Consonance



Re: domainkeys unverified - solved

2006-10-27 Thread Chris Purves

Mark wrote:



-Original Message-
From: Chris Purves [mailto:[EMAIL PROTECTED]
Sent: Friday, 27 October 2006 23:20
To: users@spamassassin.apache.org
Subject: Re: domainkeys unverified - solved


In the end, with the help of Mark Martinec, I was able to
determine that the problem was with my ISP-provided DNS
nameservers not allowing full TXT records to be returned
(they were truncated).



Symptoms for this problem were:

DK_VERIFIED does not fire for Yahoo! e-mails (multiple part
TXT record)


Interesting.

nslookup -q=txt lima._domainkey.yahoogroups.com

k=rsa;
p=MHwwDQYJKoZIhvcNAQEBBQADawAwaAJhAL10WHRWMSb9Tnl+k4Kzpc18rDCTpDT1pbK0xwkd
ZIZkaP8NB75qa/S57xccZlIwbI22Ooy/IY+8WxQtvE2z4W
LLNOf9hkMeicUH48TGkEoCAcaSjJz/b3NMrOy9l1U7gQIDAP//

I get two parts, too. Is that their correct public key, when concatenated?
Though I do not get both parts in random order, I wonder if I would not
have the same issue, then.


What you get is correct.  In my case, when it's not working I get:

[EMAIL PROTECTED]:~$ nslookup -q=txt lima._domainkey.yahoogroups.com
Server: 64.59.184.13
Address:64.59.184.13#53

Non-authoritative answer:
lima._domainkey.yahoogroups.com text = k=rsa\; 
p=MHwwDQYJKoZIhvcNAQEBBQADawAwaAJhAL10WHRWMSb9Tnl+k4Kzpc18rDCTpDT1pbK0xwkdZIZkaP8NB75qa/S57xccZlIwbI22Ooy/IY+8WxQtvE2z4W


Authoritative answers can be found from:

[EMAIL PROTECTED]:~$

I'm missing the second part of the Answer and Authority is empty. 
Using dig -t txt ... the Additional section is also empty.
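
One way to tell a truncated key from a complete one without doing any DNS
lookups is to compare the length the key's DER header declares with the
number of bytes that actually decode. Here is a minimal Python sketch,
using the two strings shown in this thread. The function names are made up
for illustration, and it assumes the p= value is base64-encoded DER, which
is what the record above looks like.

```python
import base64

# The p= value as returned in two TXT strings (copied from this thread).
FIRST_PART = (
    "MHwwDQYJKoZIhvcNAQEBBQADawAwaAJhAL10WHRWMSb9Tnl+k4Kzpc18rDCTpDT1pbK0xwkd"
    "ZIZkaP8NB75qa/S57xccZlIwbI22Ooy/IY+8WxQtvE2z4W"
)
SECOND_PART = "LLNOf9hkMeicUH48TGkEoCAcaSjJz/b3NMrOy9l1U7gQIDAP//"


def declared_der_length(der):
    """Total length a DER blob claims: header bytes plus content length."""
    if len(der) < 2:
        raise ValueError("too short to hold a DER header")
    first_len = der[1]
    if first_len < 0x80:  # short form: the length is in this byte
        return 2 + first_len
    n = first_len & 0x7F  # long form: the next n bytes hold the length
    if n == 0 or len(der) < 2 + n:
        raise ValueError("unsupported or incomplete DER length")
    return 2 + n + int.from_bytes(der[2:2 + n], "big")


def key_is_truncated(p_value):
    """True if the base64 p= value decodes to fewer bytes than it declares."""
    usable = p_value[: len(p_value) - len(p_value) % 4]  # drop a partial group
    der = base64.b64decode(usable)
    return len(der) < declared_der_length(der)


if __name__ == "__main__":
    print(key_is_truncated(FIRST_PART))                # only the first string
    print(key_is_truncated(FIRST_PART + SECOND_PART))  # both strings joined
```

The header of this key declares 126 bytes in total. The first string alone
decodes to fewer than that, while both strings joined decode to exactly 126.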


--
Chris



Re: domainkeys unverified - solved

2006-10-27 Thread Chris Purves

Peter H. Lemieux wrote:

Chris Purves wrote:
In the end, with the help of Mark Martinec, I was able to determine 
that the problem was with my ISP-provided DNS nameservers not 
allowing full TXT records to be returned (they were truncated).


Was this something that the ISP cooked up, or was it intrinsic to the 
DNS server software they are using?  If the latter, it would be good to 
know which server they were running.  It might be a useful addition to 
the FAQ/wiki.



I still have to contact them, but I'll post back with my results.


--
Chris



Re: High CPU running SA in a VMware VM

2006-10-27 Thread Sammy Anderson
And there is one of these for each user, this is just for one user.
Sounds like we may have to abandon Bayes or possibly use mysql.  Not sure
we are ready to invest in setting that all up...

Theo Van Dinter [EMAIL PROTECTED] wrote:
 On Fri, Oct 27, 2006 at 03:01:45PM -0700, Sammy Anderson wrote:
  I manually ran sa-learn --force-expire, and it hammered the box. Here
  is a debug and timing information (for just a 5 MB file!):

  [18002] dbg: bayes: token count: 161725, final goal reduction size: 49225

 want to get rid of (max) 49225 tokens

  [18002] dbg: bayes: can't use estimation method for expiry, unexpected
  result, calculating optimal atime delta (first pass)

 have to do step 1 and can't estimate

  [18002] dbg: bayes: expiry max exponent: 9
  -- about 20 seconds elapsed

 it's going through every token in your db

  [18002] dbg: bayes: atime token reduction
  [18002] dbg: bayes:  ===
  [18002] dbg: bayes: 43200 144256
  [18002] dbg: bayes: 86400 133029
  [18002] dbg: bayes: 172800 111350
  [18002] dbg: bayes: 345600 72306
  [18002] dbg: bayes: 691200 9457
  [18002] dbg: bayes: 1382400 0
 [...]
  [18002] dbg: bayes: first pass decided on 691200 for atime delta

 691200 wins the Price Is Right (9457 is the closest without going over)

  -- about 40 seconds elapsed [a sort going on here???]

 It's creating a new DB file, going back through every token in the original
 DB, and for any that are newer than 691200 seconds ago (the chosen atime
 delta), it copies the entry to the new DB.

  expired old bayes database entries in 60 seconds = YIKES

 yep.  expiry is relatively resource intensive and slow w/ DBMs, but
 there's no other good way to do it (or at least, no one has suggested
 a really better way to do it...)

 -- 
 Randomly Selected Tagline:
 I believe it's not butter, I just can't believe it's $1.59!



RE: domainkeys unverified - solved

2006-10-27 Thread Mark


 -Original Message-
 From: Chris Purves [mailto:[EMAIL PROTECTED] 
 Sent: Saturday, 28 October 2006 0:49
 To: users@spamassassin.apache.org
 Subject: Re: domainkeys unverified - solved
 
 
  DK_VERIFIED does not fire for Yahoo! e-mails (multiple part
  TXT record)
  
  Interesting.
  
  nslookup -q=txt lima._domainkey.yahoogroups.com
  
  k=rsa;
  
  p=MHwwDQYJKoZIhvcNAQEBBQADawAwaAJhAL10WHRWMSb9Tnl+k4Kzpc18rDCT
  pDT1pbK0xwkd
  ZIZkaP8NB75qa/S57xccZlIwbI22Ooy/IY+8WxQtvE2z4W
  LLNOf9hkMeicUH48TGkEoCAcaSjJz/b3NMrOy9l1U7gQIDAP//
  
  I get two parts, too. Is that their correct public key, 
  when concatenated?

 What you get is correct. In my case, when it's not working I get:
 
 [EMAIL PROTECTED]:~$ nslookup -q=txt lima._domainkey.yahoogroups.com
 Server: 64.59.184.13
 Address:64.59.184.13#53
 
 Non-authoritative answer:
 lima._domainkey.yahoogroups.com text = k=rsa\; 
 p=MHwwDQYJKoZIhvcNAQEBBQADawAwaAJhAL10WHRWMSb9Tnl+k4Kzpc18rDCT
 pDT1pbK0xwkdZIZkaP8NB75qa/S57xccZlIwbI22Ooy/IY+8WxQtvE2z4W
 
 Authoritative answers can be found from:
 
 [EMAIL PROTECTED]:~$
 
 I'm missing the second part of the Answer and Authority is empty.

Thanks. :) I was getting worried. I'm not quite ready to go to BIND 9 yet
(don't all y'all shoot me now), so I'm happy to hear it's working.

- Mark



Re: High CPU running SA in a VMware VM

2006-10-27 Thread Rick Macdougall

Sammy Anderson wrote:

And there is one of these for each user, this is just for one  user.  Sounds 
like we may have to abandon Bayes or possibly use  mysql.  Not sure we are 
ready to invest in setting that all up...



Bayes in MySQL is a snap to set up and it really runs rings around the 
dbm setup in a real-world situation.


I switched over two clients this morning and neither of them had MySQL 
installed.  Installed from source (php 5 requirements etc) and still had 
both installs done before lunch.


Regards,

Rick



RE: ImageInfo vs FuzzyOCR performance?

2006-10-27 Thread Michael Scheidell
 -Original Message-
 From: Jorge Valdes [mailto:[EMAIL PROTECTED] 
 Sent: Friday, October 27, 2006 5:12 PM
 To: users@spamassassin.apache.org
 Subject: Re: ImageInfo vs FuzzyOCR performance?
 
  SPAM Results:
    3936 Message(s)   49.83%
  19.399 Average Score
  
    3343 Time(s)    7.50%   84.93%   Hit Rule: BAYES_99
    3068 Time(s)    6.88%   77.95%   Hit Rule: HTML_MESSAGE
    1655 Time(s)    3.71%   42.05%   Hit Rule: FUZZY_OCR
    1527 Time(s)    3.42%   38.80%   Hit Rule: SARE_GIF_ATTACH
    1411 Time(s)    3.16%   35.85%   Hit Rule: URIBL_BLACK
    1274 Time(s)    2.86%   32.37%   Hit Rule: URIBL_BLACK_OVERLAP
    1271 Time(s)    2.85%   32.29%   Hit Rule: MIME_HTML_ONLY
    1215 Time(s)    2.72%   30.87%   Hit Rule: URIBL_JP_SURBL
    1187 Time(s)    2.66%   30.16%   Hit Rule: RCVD_IN_BL_SPAMCOP_NET
    1184 Time(s)    2.66%   30.08%   Hit Rule: SARE_GIF_STOX
 
What do you use to get those stats?


RE: ImageInfo vs FuzzyOCR performance?

2006-10-27 Thread Rob McEwen
Jeff Chan wrote:
 Does anyone have any recent feedback about the performance of
 ImageInfo versus FuzzyOCR about detecting stock image spams (or
 any others)?  Does FuzzyOCR catch significantly more spams than
 ImageInfo?

One of the things that ImageInfo does to avoid FPs is assign a higher
score to image-only spam where the ratio of screen-space/amount-of-text is
high. But notice how more of this type of spam has gibberish text at the
bottom lately? This messes that formula up and creates a VERY small
ImageInfo score. I know that the spammers might have been doing this to get
around bayes... but I suspect that they were really trying to get around
ImageInfo, because this change-up seemed to happen soon after ImageInfo was
introduced.

Nevertheless, I've found that manually readjusting those ratios has helped
to catch more spam. (And I'm reluctant to mention this in the first place
because if they are adjusted at the SARE site, then the spammers will only
readjust accordingly!)

Rob McEwen
PowerView Systems



SA TIMED OUT

2006-10-27 Thread M. Lewis


I upgraded to SA 3.1.4 last night and now I have two issues that I'm 
trying to resolve:


(1)
spamassassin -D --lint is giving me an error:
[2533] warn: config: failed to parse line, skipping: dcc_timeout 18


(2)
In the logs I'm seeing a good number of the following type of entry:
Oct 27 15:40:21 moe amavis[2548]: (02548-01-2) (!)SA TIMED OUT, 
backtrace: at 
/usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/DnsResolver.pm line 
363\n\teval {...} called at 
/usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/DnsResolver.pm line 
363\n\tMail::SpamAssassin::DnsResolver::poll_responses('Mail::SpamAssassin::DnsResolver=HASH(0x4005820)', 
72) called at 
/usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/Plugin/URIDNSBL.pm 
line 
710\n\tMail::SpamAssassin::Plugin::URIDNSBL::complete_lookups('Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x1ff8200)', 
'HASH(0x4cbdad0)', 72) called at 
/usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/Plugin/URIDNSBL.pm 
line 
412\n\tMail::SpamAssassin::Plugin::URIDNSBL::check_post_dnsbl('Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x1ff8200)', 
'HASH(0x6816dd0)') called at 
/usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/PluginHandler.pm line 
159\n\teval {...} called at /usr/lib/perl5/vendor_perl/5.8.8/Mail/Sp...



I've checked the archives and maybe I missed something, but I wasn't 
able to find anything that seemed relevant.


Thanks for any pointers.
Mike


[EMAIL PROTECTED] ~]# spamassassin -V
SpamAssassin version 3.1.4
  running on Perl version 5.8.8

--

 Let the machine do the dirty work.  - Elements of Programming Style
  15:35:01 up 16:21,  0 users,  load average: 0.32, 0.31, 0.28

 Linux Registered User #241685  http://counter.li.org


Re: MailScanner versus Amavisd-new with postfix

2006-10-27 Thread Mark Martinec
Jeff,

 Not to start any flamewars, but does anyone have strong opinions
 on MailScanner versus Amavisd-new for use with postfix (and of
 course SpamAssassin and ClamAV)?

Of course I'm biased, but I'd be worried about running a program with
about 400 cases of calling system routines (I/O, file system, etc.)
without checking the resulting status, or failing to report errors.
MailScanner works while everything is in order. When the unexpected
happens (e.g. disk full, I/O or file system errors, depleted system 
resources), then unpredictable things are bound to result, and may
go unnoticed for some time or prove difficult to diagnose.

  Mark


Re: spamassassin --lint fails with rules in local.cf

2006-10-27 Thread Alain Wolf

On 26.10.2006 14:35, * Dylan Bouterse wrote:
 I have added some rules in my local.cf file (for adding scores for some
 SARE rules) but when I run spamassassin --lint (or when I run
 rules_du_jour which does the same) it says the rules in my local.cf file
 are non-existent, but spamassassin ultimately runs fine. What am I doing
 wrong?
 
 Dylan
 
 

Oops, just stumbled upon the release announcement of SpamAssassin 3.1.7

http://www.nabble.com/ANNOUNCE%3A-Apache-SpamAssassin-3.1.7-available%21-tf2415849.html

3.1.7 is a quick-fix release; it contains only a fix for one bug,
introduced accidentally in 3.1.6:

- bug 5119: if admins had set rule scores in the site configuration in
  /etc, sa-update would fail.  Back out this change

Don't know if Dylan is already using 3.1.7.

We are on 3.1.6 because there is no updated FreeBSD-Port out yet.
So I wait.

Greetings
Alain





Re: SA TIMED OUT

2006-10-27 Thread Matt Kettler
M. Lewis wrote:

 I upgraded to SA 3.1.4 last night and now I have two issues that I'm
 trying to resolve:

 (1)
 spamassassin -D --lint is giving me an error:
 [2533] warn: config: failed to parse line, skipping: dcc_timeout 18
If you've not edited /etc/mail/spamassassin/v310.pre to load the DCC
plugin, DCC is disabled by default (it's not free for everyone to use,
so it is disabled pending your decision that your use falls under DCC's
license; most folks' use does, but check the license).

Without any DCC support loaded, the dcc_timeout option is meaningless to SA.
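
A quick way to check for this mismatch is to look for an uncommented
loadplugin line for the DCC plugin in the .pre file and for a dcc_timeout
setting in the config. The Python sketch below works on file contents passed
in as strings. The function names are made up for illustration, and it only
mimics the symptom; it is not how SpamAssassin itself parses its config.

```python
import re

# The loadplugin line for DCC as it appears in v310.pre once uncommented.
DCC_LOADPLUGIN = re.compile(
    r"^\s*loadplugin\s+Mail::SpamAssassin::Plugin::DCC\b", re.MULTILINE
)
DCC_TIMEOUT = re.compile(r"^\s*dcc_timeout\s+\d+", re.MULTILINE)


def dcc_plugin_enabled(pre_text):
    """True if the text has an uncommented loadplugin line for DCC."""
    return DCC_LOADPLUGIN.search(pre_text) is not None


def uses_dcc_timeout(cf_text):
    """True if the config text sets dcc_timeout on an uncommented line."""
    return DCC_TIMEOUT.search(cf_text) is not None


def lint_would_complain(pre_text, cf_text):
    """dcc_timeout is set, but nothing loads the plugin that defines it."""
    return uses_dcc_timeout(cf_text) and not dcc_plugin_enabled(pre_text)


if __name__ == "__main__":
    pre = "#loadplugin Mail::SpamAssassin::Plugin::DCC\n"
    cf = "dcc_timeout 18\n"
    print(lint_would_complain(pre, cf))
```

To run it against a real install, read the contents of
/etc/mail/spamassassin/v310.pre and your local.cf and pass them in.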



 (2)
 In the logs I'm seeing a good number of the following type of entry:
 Oct 27 15:40:21 moe amavis[2548]: (02548-01-2) (!)SA TIMED OUT,
 backtrace: at
 /usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/DnsResolver.pm line
 363\n\teval {...} called at


Sounds like your DNS is slow, and you've got a short $sa_timeout in
your amavis configs. But I'm no amavis expert.



RE: SA TIMED OUT

2006-10-27 Thread Gary V
I upgraded to SA 3.1.4 last night and now I have two issues that I'm trying 
to resolve:


(1)
spamassassin -D --lint is giving me an error:
[2533] warn: config: failed to parse line, skipping: dcc_timeout 18



You need to enable (uncomment) the DCC plugin in v310.pre


(2)
In the logs I'm seeing a good number of the following type of entry:
Oct 27 15:40:21 moe amavis[2548]: (02548-01-2) (!)SA TIMED OUT, backtrace: 
at ...


I've checked the archives and maybe I missed something, but I wasn't able 
to find anything that seemed relevant.


Thanks for any pointers.
Mike


The newer version takes longer to scan (quite noticeable on a low-powered 
system). Newer versions of amavisd-new allow scans to take longer without 
timing out, whereas older versions default to $sa_timeout = 30; that setting 
should be included in amavisd.conf and raised to something like 60 seconds. 
I also suggest moving Bayes to SQL or, failing that, setting lock_method flock 
in local.cf if appropriate.

http://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Conf.html#miscellaneous_options
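
A minimal sketch of the two settings mentioned above, assuming they go in
the usual amavisd.conf and local.cf files (locations vary by installation).
The 60-second figure is just the suggestion from this message, not a
tested recommendation.

```
# amavisd.conf (Perl syntax)
$sa_timeout = 60;   # seconds allowed for a SpamAssassin scan; older default was 30

# local.cf (SpamAssassin syntax: no "=" between option and value)
# flock is only appropriate when the Bayes files are on a local, non-NFS filesystem
lock_method flock
```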





RE: SA TIMED OUT

2006-10-27 Thread Gary V

spamassassin -D --lint is giving me an error:
[2533] warn: config: failed to parse line, skipping: dcc_timeout 18


BTW, as Matt says, your DNS may be slow. If DCC doesn't respond within 10 
seconds, I would imagine it's unlikely it will respond - so I wouldn't waste 
time waiting around another 8 seconds. Many people find a local caching DNS 
server really helps on net tests.
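
One way to check whether slow DNS is the culprit is to time a couple of
lookups. The sketch below is only an illustration: the time_lookup helper
and the example.com name are made up for this example and are not part of
SpamAssassin or amavisd-new.

```python
import socket
import time


def time_lookup(hostname, resolver=socket.gethostbyname):
    """Return (succeeded, elapsed_seconds) for one lookup of hostname."""
    start = time.perf_counter()
    try:
        resolver(hostname)
        succeeded = True
    except OSError:  # socket.gaierror is a subclass of OSError
        succeeded = False
    return succeeded, time.perf_counter() - start


if __name__ == "__main__":
    # "example.com" is a placeholder; substitute a name your net tests query.
    for attempt in (1, 2):
        ok, elapsed = time_lookup("example.com")
        print(f"lookup {attempt}: ok={ok} elapsed={elapsed:.3f}s")
```

With a local caching name server running, the second lookup is typically
much faster than the first; if both are slow, the resolver itself is
worth a look before raising any timeouts.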


Gary V





Re: SA TIMED OUT

2006-10-27 Thread M. Lewis

Matt Kettler wrote:

M. Lewis wrote:

I upgraded to SA 3.1.4 last night and now I have two issues that I'm
trying to resolve:

(1)
spamassassin -D --lint is giving me an error:
[2533] warn: config: failed to parse line, skipping: dcc_timeout 18

If you've not edited /etc/mail/spamassassin/v310.pre to load the dcc
plugin, dcc is disabled by default (it's not free for everyone to use,
so it is disabled pending your decision that your use falls under DCC's
license; most folks' use does, but check the license).

Without any DCC support loaded, the dcc_timeout option is meaningless to SA.



This was indeed the problem. Error gone now.





(2)
In the logs I'm seeing a good number of the following type of entry:
Oct 27 15:40:21 moe amavis[2548]: (02548-01-2) (!)SA TIMED OUT,
backtrace: at
/usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/DnsResolver.pm line
363\n\teval {...} called at



Sounds like your DNS is slow, and you've got a short $sa_timeout in
your amavis configs. But I'm no amavis expert.


Actually I rebuilt this machine last night and forgot to turn on the 
caching NS. That made a difference!


Thanks Matt!


--

 May the bugs of many programs nest on your hard drive.
  22:45:01 up  3:13,  0 users,  load average: 0.10, 0.17, 0.17

 Linux Registered User #241685  http://counter.li.org


Re: SA TIMED OUT

2006-10-27 Thread M. Lewis


Gary V wrote:
I upgraded to SA 3.1.4 last night and now I have two issues that I'm 
trying to resolve:


(1)
spamassassin -D --lint is giving me an error:
[2533] warn: config: failed to parse line, skipping: dcc_timeout 18



You need to enable (uncomment) the DCC plugin in v310.pre


Done and the error is gone now.





(2)
In the logs I'm seeing a good number of the following type of entry:
Oct 27 15:40:21 moe amavis[2548]: (02548-01-2) (!)SA TIMED OUT, 
backtrace: at ...


I've checked the archives and maybe I missed something, but I wasn't 
able to find anything that seemed relevant.


Thanks for any pointers.
Mike


The newer version takes longer to scan (quite noticeable on a low-powered 
system). Newer versions of amavisd-new allow scans to take longer 
without timing out, whereas older versions default to $sa_timeout = 30; 
that setting should be included in amavisd.conf and raised to something 
like 60 seconds. I also suggest moving Bayes to SQL or, failing that, 
setting lock_method flock in local.cf if appropriate.
http://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Conf.html#miscellaneous_options 


Thanks Gary for the explanation. I will check into all of these.

Thanks,
Mike












Re: SA TIMED OUT

2006-10-27 Thread M. Lewis

Gary V wrote:

spamassassin -D --lint is giving me an error:
[2533] warn: config: failed to parse line, skipping: dcc_timeout 18


BTW, as Matt says, your DNS may be slow. If DCC doesn't respond within 
10 seconds, I would imagine it's unlikely it will respond - so I 
wouldn't waste time waiting around another 8 seconds. Many people find a 
local caching DNS server really helps on net tests.


Gary V


Yes, I had been using a caching NS prior to rebuilding the machine 
yesterday. I simply forgot to turn it on this time. Duh.


Thanks,
Mike


--

 IBM: Icons Bygones My Mom's
  22:50:01 up  3:18,  0 users,  load average: 0.53, 0.30, 0.22

 Linux Registered User #241685  http://counter.li.org