Problem with sa-compile

2009-03-24 Thread JC Putter
Hi, I am trying to compile my rules, but I am getting the following error:
 
Wide character in print at /usr/bin/sa-compile line 385, $fh line 3490.
Wide character in print at /usr/bin/sa-compile line 385, $fh line 6690.
re2c -i -b -o scanner1.c scanner1.re
re2c -i -b -o scanner2.c scanner2.re
re2c -i -b -o scanner3.c scanner3.re
re2c -i -b -o scanner4.c scanner4.re
re2c -i -b -o scanner5.c scanner5.re
re2c -i -b -o scanner6.c scanner6.re
re2c -i -b -o scanner7.c scanner7.re
re2c -i -b -o scanner8.c scanner8.re
re2c -i -b -o scanner9.c scanner9.re
re2c -i -b -o scanner10.c scanner10.re
re2c: error: line 102, column 2: Token exceeds limit
command failed! at /usr/bin/sa-compile line 288, $fh line 7288.
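
One way to narrow down where the "Wide character" warnings come from is to look for rule lines that contain non-ASCII characters. The following is only a first-pass heuristic sketched in Python, assuming the rules are plain .cf files; the function name is hypothetical, and it does not address the re2c "Token exceeds limit" error itself.

```python
from pathlib import Path


def find_non_ascii_lines(path):
    """Return (line_number, text) for every line containing a non-ASCII byte."""
    hits = []
    raw = Path(path).read_bytes()
    for number, line in enumerate(raw.splitlines(), start=1):
        # Any byte above 0x7F means the line is not pure ASCII.
        if any(byte > 0x7F for byte in line):
            hits.append((number, line.decode("utf-8", errors="replace")))
    return hits


if __name__ == "__main__":
    import sys

    # Usage (paths are whatever rule files you want to inspect):
    #   python find_non_ascii.py file1.cf file2.cf
    for rule_file in sys.argv[1:]:
        for number, text in find_non_ascii_lines(rule_file):
            print(f"{rule_file}:{number}: {text}")
```

Any lines it reports are candidates for closer inspection, not proof that they caused the warning.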




__ Information from ESET NOD32 Antivirus, version of virus signature 
database 3956 (20090323) __

The message was checked by ESET NOD32 Antivirus.

http://www.eset.com


-- 
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.



USER_IN_WHITELIST problem.

2009-03-24 Thread Bug
Dear users !

I'm using Exim + spamd + user_prefs in MySQL. All works fine.

But I found a bug: when I use a whitelist and the header rcpt to: contains an
address with a text description, the whitelist lookup in the database fails.
For example:

1st message:
spamd: clean message (-91.7/10.0) for t...@localdomain.com:501 in 8.2 seconds, 
13522829 bytes.

2nd message:
spamd: identified spam (9.2/5.0) for some text t...@localdomain.com:501 in 
8.3 seconds, 14874071 bytes.



Here "some text" is, for example, the user name stored in the address book of
the sender's mail agent.


MySQL userpref structure:
 username | preference     | value             | prefid
----------+----------------+-------------------+--------
 test     | whitelist_from | sen...@hidden.com | 1


 Mysql database query in sql.cf:
 user_scores_sql_custom_query SELECT preference, value FROM _TABLE_ WHERE 
username IN (_USERNAME_, '$GLOBAL', CONCAT(_MAILBOX_, '@' , 
_DOMAIN_),SUBSTRING_INDEX(_USERNAME_, '@', 1)) ORDER BY username ASC


Everything works when the recipient address in the incoming message is
canonical, like t...@localdomain.com.
How can I fix this?


Thanks.
Wbr,
Steve




Re: USER_IN_WHITELIST problem.

2009-03-24 Thread Matt Kettler
Bug wrote:
 Dear users !

 I`m using exim + spamd + user_prefs in mysql. All works fine.

 But I found a bug, when I`m using whitelist, and header rcpt to:  have
 address with character description, whitelist failed to catch it in
 database. For example:

 1st message:
 spamd: clean message (-91.7/10.0) for t...@localdomain.com:501 in 8.2 
 seconds, 13522829 bytes.

 2nd message:
 spamd: identified spam (9.2/5.0) for some text t...@localdomain.com:501 in 
 8.3 seconds, 14874071 bytes.



 Where some text for example User Name in address book of senders
 mail agent.


 Mysql userpref struct:
  username | preference     | value             | prefid
  test     | whitelist_from | sen...@hidden.com | 1


  Mysql database query in sql.cf:
  user_scores_sql_custom_query SELECT preference, value FROM _TABLE_ WHERE 
 username IN (_USERNAME_, '$GLOBAL', CONCAT(_MAILBOX_, '@' , 
 _DOMAIN_),SUBSTRING_INDEX(_USERNAME_, '@', 1)) ORDER BY username ASC


 All works fine, when address of recipient in incoming letter is canonical 
 like t...@localdomain.com
 How can I fix this ?
   
Stop passing extra garbage in the -u parameter to spamc?

The "some text" part can't legally occur in a RCPT TO: command (which is
not a header). Do you mean that you are extracting the entire contents of
the To: header?

spamc isn't designed to strip off all that extra data; it expects username or
usern...@domain only.

I'd try to find a way to get the RCPT TO address rather than the To: header
anyway. The To: header might not contain the actual recipient and isn't a
useful header for selecting user prefs (e.g., posts sent to mailing lists are
RCPT TO: you, but they are To: the list). This is precisely why SA doesn't try
to parse the To: header and use that for selecting prefs. It is often
misleading.
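
If only a header-style value is available, the display name can at least be stripped before the value is used as a username. The sketch below uses Python's standard email.utils.parseaddr; the function name and the example.com address are hypothetical, and using the envelope recipient remains the better fix.

```python
from email.utils import parseaddr


def bare_recipient(value):
    """Drop any display name and return only the address part."""
    _display_name, address = parseaddr(value)
    return address


# A header-style value with a display name and a plain address both
# reduce to the same lookup key.
print(bare_recipient("User Name <user@example.com>"))
print(bare_recipient("user@example.com"))
```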










RE: Server overload, queuing for SA possible?

2009-03-24 Thread Bowie Bailey
Monky wrote:
 Hallo list,
 receiving a bunch of obvious spam emails without the SA tags in them
 made me look at my logfiles, and I found out (that's my guess, anyway)
 that for a short time my server was reaching its limits.
 Short grep extracts from my logfile:

 Mar 21 11:14:48 h1306680 spamd[9247]: prefork: child states: B
 Mar 21 11:15:31 h1306680 spamd[9247]: prefork: server reached
 --max-children setting, consider raising it

 Mar 21 11:11:48 h1306680 spamd[3550]: spamd: identified spam
 (32.3/5.0) for popuser:110 in 373.2 seconds, 17910 bytes.
 Mar 21 11:12:47 h1306680 spamd[30139]: spamd: identified spam
 (14.8/5.0) for popuser:110 in 412.3 seconds, 4164 bytes.

 What I make of this is that when my server is using its maximum of 5
 spamd children it hits the RAM limit and starts paging (hence the explosion
 of scanning time). Is this a sensible assessment?
 
 What could I do about it? Raising --max-children does not seem like a good
 idea. Actually it seems wiser to reduce it to a maximum of 4 children.
 How can I prevent spam from passing my system unchecked due to a
 (temporary) overload? If I look at the prefork child states, the
 critical time is followed by hours of II / BI. How could I queue the
 incoming emails and ensure that every email gets scanned?
 Currently I am using qmail's defaultdelivery setting:
  spamc | /usr/bin/deliverquota ./Maildir
 Additionally some users use procmail to filter spam.
 
 Any hints and help appreciated!

Your assessment sounds right to me.  I would make two suggestions.

1) Memory is cheap these days.  Add some more RAM.

2) Reduce the maximum children setting so that the system doesn't start
swapping.  This will cause SA to scan faster and should result in fewer
messages slipping through while SA is busy.
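
As a rough illustration of suggestion 2, the child limit can be estimated from the memory left over for spamd. This is only back-of-the-envelope arithmetic in Python; the function name and all figures are hypothetical and should be replaced with values measured on the actual server (for example from free and ps).

```python
def max_children_for_ram(total_mb, reserved_mb, per_child_mb):
    """Largest number of spamd children that fits in RAM without swapping."""
    usable = total_mb - reserved_mb
    if usable <= 0 or per_child_mb <= 0:
        return 1  # always keep at least one child
    return max(1, usable // per_child_mb)


# Hypothetical box: 512 MB total, 256 MB kept for everything else,
# roughly 60 MB per spamd child.
print(max_children_for_ram(512, 256, 60))  # 4
```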

-- 
Bowie


Re: Server overload, queuing for SA possible?

2009-03-24 Thread Henrik K
On Sat, Mar 21, 2009 at 07:20:52AM -0700, Monky wrote:
 
 What I make of this is that when my server is using his maximum of 5 spamd
 children he hits the RAM limit and starts paging (the explosion of scanning
 time). Is this a sensible assessment?

How can we assess anything if you keep the crucial data to yourself? :)

At least give us the output of free and ps axu.



ruleset

2009-03-24 Thread JC Putter
Where can I find more rulesets? I am using the OpenProtect SARE rules and the sought ruleset.





lookup user_prefs in SQL database (not using spamc)

2009-03-24 Thread Guido
Hi,

I am trying to configure my system so that it can assign user-specific
scores. I therefore set up a table as described in [1]. This runs fine, as
long as I use spamc to scan mails.

But actually I want to use Amavisd-new with SpamAssassin. Here
SpamAssassin completely ignores the SQL settings. Other settings
in local.cf are treated correctly.

 - How can I convince SpamAssassin (used by amavisd-new) to care about my
   user_prefs in the database?

BTW: SpamAssassin should not try to search for user-specific settings in
users' home directories. Not at all. How can I do that?


Any hints would be appreciated.

Thank you


Guido

[1] http://wiki.apache.org/spamassassin/UsingSQL



Re: Trying to understand scoring discrepancy

2009-03-24 Thread Matus UHLAR - fantomas
On 23.03.09 09:43, mkellogg wrote:
 Running spamc from the command line and generating the full report.

  0.0 TVD_RCVD_IP4   TVD_RCVD_IP4
  0.0 TVD_RCVD_IP    TVD_RCVD_IP

 However, my score file has the following lines:
 score TVD_RCVD_IP4 4.099 3.344 2.901 3.183 # n=2
 score TVD_RCVD_IP 0.502 1.617 2.270 1.931 # n=2

You apparently have a score line in user_prefs, or in the system-wide config
directory, which takes precedence over those in the SA rules dirs.
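
The effect can be pictured as "the last score line read wins". The Python sketch below is a deliberately simplified model of that precedence, not SpamAssassin's actual parser; the function name and the zero-score override lines are hypothetical.

```python
def effective_scores(config_texts):
    """Apply 'score NAME v1 [v2 v3 v4]' lines in reading order; the last one wins."""
    scores = {}
    for text in config_texts:
        for line in text.splitlines():
            line = line.split("#", 1)[0].strip()  # drop trailing comments
            parts = line.split()
            if len(parts) >= 3 and parts[0] == "score":
                scores[parts[1]] = [float(value) for value in parts[2:]]
    return scores


rules_dir = """score TVD_RCVD_IP4 4.099 3.344 2.901 3.183 # n=2
score TVD_RCVD_IP 0.502 1.617 2.270 1.931 # n=2"""
# Hypothetical override, read later (e.g. site config or user_prefs).
override = """score TVD_RCVD_IP4 0
score TVD_RCVD_IP 0"""

print(effective_scores([rules_dir, override]))
```

Searching the site-wide config directory and user_prefs for lines starting with "score TVD_RCVD_IP" is the practical equivalent.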

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
There's a long-standing bug relating to the x86 architecture that
allows you to install Windows.   -- Matthew D. Fuller


Re: Spam Assassin White List

2009-03-24 Thread Matus UHLAR - fantomas
On 23.03.09 21:58, dsh979 wrote:
 I did not realise that items listed on the white list or the black list
 would still be subject to the operation/analysis of the SpamAssassin Rules.  

All rules are processed unless you play with the ShortCircuit plugin. Beware of
that: it may render SA useless if you don't know what you are doing.

 You have asked why I have set the required score the 100.  Lengthy
 explanation (sorry).  I have done this to prevent SpamAssassin from
 inserting SpamWarnings into the header/body of the relevant email.

There's the report_safe option to configure that.

 In responding to spam I rely on the SpamAssassin Score in conjunction with
 other email message indicators), and incorporate these variables into a
 domain level filter (cPanel).

cPanel? In that case you should probably direct your questions to cPanel
support (forum/list).

 Mail is then bounced (by the filter) without
 any warning in the bounced email itself, that it has been bounced because it
 has been identified as spam.  In fact, the bounced email will have a message
 inserted to the effect that there is no such user/receipient.  In this way,
 if there is a sender who receives the bounced email, hopefully they take me
 off their mailing list, instead of looking for a way to 'outsmart' the
 SpamRules.

Bouncing sucks. Bouncing spam is dangerous, since most spam has a false
return address, so you are bouncing to an innocent third party (which may get
your outgoing mail blocked by blacklists). Reject unwanted mail when it
arrives; don't bounce, especially when you are sure it's spam.

 Q:How can I list items/users on a white list or a black list without the
 lists (and items) being the subject of further analysis by the SpamAssassin
 Rules (and therefore obtaining the same score for each item on the relevant
 list, irrespective of the operation of the SpamAssassin Rules, that is
 -100=white list items  +100 = black list items)?

I somehow do not understand this question.
-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
BSE = Mad Cow Desease ... BSA = Mad Software Producents Desease


Re: ruleset

2009-03-24 Thread Matus UHLAR - fantomas
On 24.03.09 15:59, JC Putter wrote:
 where can i find more rulesets? using openprotect sare rules and sought 
 rulesets

Build your own rulesets? SARE rulesets aren't updated anymore AFAIK (and
thus the number of false positives is increasing).

Do you have any problem that can't be solved by fine-tuning SA?

-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety. -- Benjamin Franklin, 1759


Re: Using SpamAssassin for just the Bayesian filtering?

2009-03-24 Thread Matt Garretson
Randy J. Ray wrote:
 filtering on other content, filtering that isn't the same as spam-testing.
 In a nutshell, we currently use the bogofilter application to classify
 messages, and invoke it with different word-list files to represent different
 filtering requirements. But this isn't going to scale well for us as written,
 and I'm the


If you use sendmail, then consider doing everything from within mimedefang.
You can filter and molest messages as much as you'd like, with simple perl 
code. It'd be a very general solution, as you require.

Here, from within mimedefang, every message that reaches DATA phase goes 
through clamav, bogofilter (single word list), and then SpamAssassin if 
bogofilter didn't give a certain enough score.  Plus there's a bunch of 
other weird custom filtering going on. :)

I don't know how high you need to scale things, but the above works fine on
a smallish to medium scale (~200k messages a day). Adding more BF wordlists
probably wouldn't be a problem, given enough memory.




Re[2]: USER_IN_WHITELIST problem. SOLVED

2009-03-24 Thread Bug
Thank you Matt!


Your letter helped me to understand my problem better.

I'm not using spamc; my Exim uses the spam ACL condition, which connects
directly to the spamd IP/port.

The solution I found is described in the Exim FAQ:

A0512:  Envelope-To: is added at delivery time, by the transport.
Therefore, the header doesn't exist at filter time. In a user filter, the
values you probably want are in $original_local_part and $original_domain.
In a system filter, the complete list of all envelope recipients is in
$recipients.

The incorrect lines are commented out; the working spam ACL config now looks like:


  warn  message = X-Spam-Score: $spam_score ($spam_bar)
        spam    = $recipients
#       spam    = $h_to
  warn  message = X-Spam-Report: $spam_report
        spam    = $recipients
#       spam    = $h_to



 Stop passing extra garbage in the -u parameter to spamc?

 The some text part can't legally occur in a RCPT TO: command (which is
 not a header). Did you mean are you extracting the entire contents of
 the To: header?

 spamc isn't designed to parse all that extra data off, username or
 usern...@domain only.

 I'd try to find a way to get the RCPT TO not the To: anyway. The To:
 header might not contain the actual recipient and isn't a useful header
 for selecting user prefs. (i.e.: posts sent to mailing lists are RCPT
 TO: you, but they are To: the list) This is precisely why SA doesn't try
 to parse the To: header and use that for selecting prefs.. It is often
 misleading.


 But I found a bug, when I`m using whitelist, and header rcpt to:  have
 address with character description, whitelist failed to catch it in
 database. For example:

 1st message:
 spamd: clean message (-91.7/10.0) for t...@localdomain.com:501 in 8.2 
 seconds, 13522829 bytes.

 2nd message:
 spamd: identified spam (9.2/5.0) for some text t...@localdomain.com:501 in 
 8.3 seconds, 14874071 bytes.


Thank you!
Wbr,
Good luck with spam fight!




Re: Spam Assassin White List

2009-03-24 Thread John Hardin

On Mon, 23 Mar 2009, dsh979 wrote:

Q:How can I list items/users on a white list or a black list without 
the lists (and items) being the subject of further analysis by the 
SpamAssassin Rules


That has to be done outside SA. Basically (modulo shortcuts, which you 
shouldn't be playing with) SA always checks all rules against every 
message it processes.


If you want a hard whitelist or blacklist, then whatever is passing the 
messages from your MTA to SA for scoring (the glue layer) needs to 
implement that capability, and not give those messages to SA in the first 
place.


As others have said, _do not_ bounce (i.e. accept and then later send a 
failure-to-deliver message to the sender) spams. It is a given that spam 
is sent with a forged From address. If you bounce spams in this way, 
you're simply attacking some innocent third party - and this may result in 
_your_ MTA getting blacklisted.


If you want a hard blacklist, check the sender in your MTA _during SMTP_ 
and reject the message rather than accepting it. The typical way to do 
this is by configuring your MTA to check DNS blacklists, such as 
zen.spamhaus.org, and to add MTA rules rejecting specific senders or IP 
addresses.


Also: your MTA should not be accepting messages for invalid addresses. 
Those need to be rejected during the SMTP phase.


These particular questions are better directed at the cPanel list, as
you're asking "how do I reject emails with a specific from_address,
to_address or sender_IP during SMTP time?" and "how do I skip SA for a
specific from_address, to_address or sender_IP?"


Best of luck.

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Liberals love sex ed because it teaches kids to be safe around their
  sex organs. Conservatives love gun education because it teaches kids
  to be safe around guns. However, both believe that the other's
  education goals lead to dangers too terrible to contemplate.
---
 62 days since Obama's inauguration and still no unicorn!


RE: lookup user_prefs in SQL database ( not using spamc)

2009-03-24 Thread Bowie Bailey
Guido wrote:
 Hi,
 
 I am trying to configure my system that it can assign user specific
 scores. I therefore set up a table like described in [1]. This runs
 fine, as long as I use spamc to scan mails.
 
 But actually I want to use Amavisd-new using spamassassin. Here
 spamassassin complety ignores the sql settings. Other settings
 in the local.cf are threaded correctly.
 
  - How can I convince spamassassin (used by amavisd-new) to care
about my user_prefs in the database?

Amavisd-new scans everything as a single user.  It has no concept of
per-user settings.

 BTW: spamassassin should not try to search for user specific settings
 in user's home directorys. Not all all. How can I do that?

Amavisd-new will not look at the user's home directories.

-- 
Bowie


Re: Spam Assassin White List

2009-03-24 Thread Jeff Mincy
   From: Matus UHLAR - fantomas uh...@fantomas.sk
   Date: Tue, 24 Mar 2009 15:30:23 +0100
   
   On 23.03.09 21:58, dsh979 wrote:
I did not realise that items listed on the white list or the black list
would still be subject to the operation/analysis of the SpamAssassin 
Rules.  
   
   all rules are processed unless you play with ShortCircuit plugin. Beware of
   that: It may render the SA useless if you don't knwo what you are doing.
   
You have asked why I have set the required score the 100.  Lengthy
explanation (sorry).  I have done this to prevent SpamAssassin from
inserting SpamWarnings into the header/body of the relevant email.
   
   There's report_safe option to configure that.
   
Also rewrite_header 
   
Q:How can I list items/users on a white list or a black list without 
the
lists (and items) being the subject of further analysis by the SpamAssassin
Rules (and therefore obtaining the same score for each item on the relevant
list, irrespective of the operation of the SpamAssassin Rules, that is
-100=white list items  +100 = black list items)?
   
   I somehow do not understand this question.

He wants the white/black lists to run first and then short circuit.
So anybody in the whitelist gets a score of -100 and anybody in the
blacklist gets a score of +100.  This can probably be done with the
ShortCircuit plugin and setting the priority of the rules so that they
run first.

Black lists aren't all that useful for stopping spam.   The email
addresses are forged in spam.

-jeff


RE: lookup user_prefs in SQL database (not using spamc)

2009-03-24 Thread Guido
   - How can I convince spamassassin (used by amavisd-new) to care
 about my user_prefs in the database?
 
 Amavisd-new scans everything as a single user.  It has no concept of
 per-user settings.
 
What I mean is per recipient settings. 

And if that's the case, at least the default settings ($GLOBAL) should
work. But there is not even an attempt to connect to the database.

  BTW: spamassassin should not try to search for user specific settings
  in user's home directorys. Not all all. How can I do that?
 Amavisd-new will not look at the user's home directories.
No, but SA does.



Re: Trying to understand scoring discrepancy

2009-03-24 Thread mkellogg

<facepalm/>

That was indeed the case. Thank you.

Matt


Matus UHLAR - fantomas wrote:
 
 
 you apparently have a score line in user_prefs, or in system-wide
 directory, which prevails over those in SA rules dirs.
 
 

-- 
View this message in context: 
http://www.nabble.com/Trying-to-understand-scoring-discrepancy-tp22663873p22682989.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.



RE: Server overload, queuing for SA possible?

2009-03-24 Thread Brian J. Murrell
On Tue, 2009-03-24 at 08:10 -0500, Bowie Bailey wrote:
 
 Your assessment sounds right to me.  I would make two suggestions.
 
 1) Memory is cheap these days.  Add some more RAM.

That's a mitigation strategy, yes, but it doesn't really answer the OP's
question: how to make spamd stop trying to process all incoming messages
at the moment they arrive and instead put them into a queue, in an effort
to even out the load.

 2) Reduce the maximum children setting so that the system doesn't start
 swapping.  This will cause SA to scan faster and should result in fewer
 messages slipping through while SA is busy.

But it also means that if the incoming load temporarily exceeds the
number of available children, then the excess doesn't get
spamd treatment.  Or does it?

If I have 5 spamd children available and (just to torture it) I fire off
50 spamc processes, what happens?

b.





Re: Using SpamAssassin for just the Bayesian filtering?

2009-03-24 Thread Justin Mason
hi --

This would indeed be possible -- just take the contents of the ruleset
dir in /usr/share/spamassassin , throw out most of it, and keep just
23_bayes.cf . Then when you run spamd, tell it to use that rules dir
instead of the default.

You should probably also add a 99_local.cf which contains a
bayes_path directive pointing at the custom Bayes DBs for the word
list you're using. Each server would then have to use a different
rules dir, and you'd have multiple servers, one for each word list.

And also: turn off bayes_auto_learn ;)
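
The steps above could be scripted roughly as follows. This is a minimal Python sketch under stated assumptions: the function name, the target directory and the Bayes path are hypothetical, and only the file names 23_bayes.cf and 99_local.cf and the bayes_path and bayes_auto_learn settings come from the description above.

```python
import shutil
from pathlib import Path


def build_bayes_only_rules(default_rules_dir, target_dir, bayes_path):
    """Create target_dir holding only 23_bayes.cf plus a small 99_local.cf."""
    source = Path(default_rules_dir) / "23_bayes.cf"
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    shutil.copy(source, target / "23_bayes.cf")
    # Point this rules dir at its own Bayes DB and disable auto-learning.
    local_cf = f"bayes_path {bayes_path}\nbayes_auto_learn 0\n"
    (target / "99_local.cf").write_text(local_cf)
    return target


if __name__ == "__main__":
    # Hypothetical paths: one rules dir per word list.
    build_bayes_only_rules(
        "/usr/share/spamassassin",
        "/tmp/rules-wordlist-a",
        "/tmp/bayes-wordlist-a/bayes",
    )
```

Calling it once per word list yields one rules dir per spamd instance.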

--j.

On Mon, Mar 23, 2009 at 23:11, Randy J. Ray rj...@corp.oodle.com wrote:
 Having gone over the FAQ and other doc-sections on the wiki, I haven't been
 able to answer my questions. So here's hoping the user-community can help!

 My company is currently using a home-brew solution for applying naive Bayes
 filtering to data. Currently, what we're doing is basically spam filtering
 on email messages that pass through our system. However, we have a need to
 do filtering on other content, filtering that isn't the same as
 spam-testing. In a nutshell, we currently use the bogofilter application
 to classify messages, and invoke it with different word-list files to
 represent different filtering requirements. But this isn't going to scale
 well for us as written, and I'm the lucky soul tasked with coming up with a
 better way.

 I'd like to adapt SA to this, if I can. I've used it in the past (and my ISP
 for my personal email is fiercely loyal to it), but only ever for basic
 email analysis. What I need, in this case, is a scalable Bayesian
 classifier. I see from the docs that using SA will get me a usable
 client/server model, which would take care of most of the scaling issues by
 making it easier for us to move the classifier to a dedicated machine (if
 needed, or at least a less-loaded one). What I *can't* puzzle out from the
 docs, is how to set up such a daemon to do *only* the Bayes part, not the
 rest of the typical spam checking (for one thing, these won't be email
 messages and thus will not have any SMTP headers at all). Also, I (we) would
 need to be able to either have the one daemon dynamically choose the
 database/word-list to use when judging a message, or run multiple instances
 that each look at a different db/word-list.

 Is this do-able with SA? I had hoped that there would be a more general
 solution around bogofilter, either a client/server application pair or a
 more API/library-based interface to calling it for training and for
 evaluation. But there isn't (not that I can find, anyway). And SA is a
 system with a long history and a solid code-base, so it seemed worthwhile to
 at least check and see if this was possible.

 Thanks in advance for any help, advice, etc.

 Randy
 --
 
 Randy J. Ray          Oodle, Inc.
  http://www.oodle.com
 rj...@corp.oodle.com




RE: lookup user_prefs in SQL database (not using spamc)

2009-03-24 Thread McDonald, Dan
On Tue, 2009-03-24 at 16:30 +0100, Guido wrote:
- How can I convince spamassassin (used by amavisd-new) to care
  about my user_prefs in the database?
  
  Amavisd-new scans everything as a single user.  It has no concept of
  per-user settings

You probably need to read the sql documentation for amavisd-new
http://www.ijs.si/software/amavisd/README.sql.txt
  
 What I mean is per recipient settings. 
 
 And if that's the case, at least the default settings ($Global) should
 work. And there is not even a try to connect to the database.

Have you restarted amavisd-new since you added the @lookup_sql_dsn?
 
   BTW: spamassassin should not try to search for user specific settings
   in user's home directorys. Not all all. How can I do that?
  Amavisd-new will not look at the user's home directories.
 No, but SA does.

But amavisd-new doesn't call SpamAssassin as an external program.  It loads the
Perl libraries and runs the same scoring code base.  It behaves
differently from the spamc client...


-- 
Daniel J McDonald, CCIE #2495, CISSP #78281, CNX
Austin Energy
http://www.austinenergy.com





RE: Server overload, queuing for SA possible?

2009-03-24 Thread Bowie Bailey
Brian J. Murrell wrote:
 On Tue, 2009-03-24 at 08:10 -0500, Bowie Bailey wrote:
  
  Your assessment sounds right to me.  I would make two suggestions.
  
  1) Memory is cheap these days.  Add some more RAM.
 
 That's a mitigation strategy, yes, but it doesn't really answer OP's
 question about how to make spamd stop trying to allocate new incoming
 spams to try to process them all at the time they come in and instead
 put them into a queue, in a effort to try to even the load out.

Adding memory will allow more spamd children to run without causing
memory problems.  If you need to be able to handle a large influx of
mail, this may be the best solution.

Queuing the messages can be done, but if you get to a point where it is
taking 3 minutes to scan each message, the queue is going to continue to
grow almost indefinitely.

  2) Reduce the maximum children setting so that the system doesn't
  start swapping.  This will cause SA to scan faster and should
  result in fewer messages slipping through while SA is busy.
 
 But it also means if the incoming load temporarily overruns the
 available children currently available, then the excess doesn't get
 spamd treatment.  Or does it?

Excess messages will get passed through unscanned at the default
settings.  However, lowering the max children will prevent swapping and
allow SA to process the messages faster, so the bottleneck will not last
as long.  On a server with a steady mail load, you can get a condition
where a temporary spike in the number of incoming messages causes SA to
spawn more children and use up the available memory.  SA then slows down
considerably and is unable to catch up until you manually clear the
queue.  If max-children is set so that the system doesn't swap, then the
bottleneck will only last for a short time and only a few unscanned
messages will make it through.

There is a --no-safe-fallback option on spamc which will cause it to
exit with an error message in the case of any problems (Normally, it
always exits with a 0 exit status).  If you don't want anything to go
through unscanned, you can try this setting.
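
A delivery wrapper could use that exit status to defer a message instead of delivering it unscanned. The Python sketch below assumes that a non-zero exit status means the scan failed, as described above; the function name is hypothetical, and the injectable run parameter exists only so the logic can be exercised without spamc installed.

```python
import subprocess


def scan_or_defer(message, run=subprocess.run):
    """Pipe message (bytes) through spamc; return (output, True) or (message, False)."""
    result = run(
        ["spamc", "--no-safe-fallback"],
        input=message,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if result.returncode != 0:
        # Scan failed: tell the caller to retry later rather than deliver unscanned.
        return message, False
    return result.stdout, True
```

The caller decides what "defer" means, for example returning a temporary failure to the MTA so that the message stays in its queue.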

 If I have 5 spamd children available and (just to torture it) I fire
 off 50 spamc processes, what happens?

Each spamc process will try to connect to spamd three times at 1 second
intervals.  Any processes that fail to connect will let their message
through unscanned.  The number of retries and the sleep interval between
them can be configured on the spamc command line.

You can also try to prevent this by controlling how many concurrent
delivery threads are run by the server (assuming you are running SA at
delivery time).  This will allow the MTA to handle the queuing.
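
The idea of capping concurrency so that excess messages wait instead of being skipped can be sketched with a fixed-size worker pool. This Python example only illustrates the queuing behaviour; the function name is hypothetical, scan stands in for whatever actually calls SpamAssassin, and a real setup would configure the limit in the MTA rather than in a script.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_all(messages, scan, max_concurrent=5):
    """Scan every message while keeping at most max_concurrent scans in flight."""
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        # Messages beyond the limit wait in the pool's queue until a worker is free.
        return list(pool.map(scan, messages))


if __name__ == "__main__":
    # 50 messages, 5 workers: all 50 get processed, 5 at a time.
    results = scan_all([f"message {i}" for i in range(50)], str.upper)
    print(len(results))
```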

-- 
Bowie


spam assassin: default scores for URIBL_.._SURBL seem low to me

2009-03-24 Thread Dennis German

It seems to me that the default scores of 1.2 to 1.9
for messages containing URIs which are blacklisted
in any of the various JP, AB, OB, PH, SC, ... lists
should be significantly higher, perhaps close to the default
required score of 5.0.

Some information is at http://ruleqa.spamassassin.org,
including the fact that
86% of URIBL_JP_SURBL hits also hit URIBL_OB_SURBL
66% of URIBL_JP_SURBL hits also hit URIBL_WS_SURBL
56% of URIBL_JP_SURBL hits also hit URIBL_AB_SURBL
etc.

Is there a discussion somewhere of the tests, scores and philosophy,
including but not limited to these?
Thanks,
Dennis German


Re: spam assassin: default scores for URIBL_.._SURBL seem low to me

2009-03-24 Thread John Hardin

On Tue, 24 Mar 2009, Dennis German wrote:


It seems to me that the default score of from 1.2 to 1.9,
for messages originating from URIs which are Black listed
in any of the various JP,  AB, OB, PH, SC, ...  lists,
should be significantly higher, perhaps nearly the default
required score of 5.0

Some information is at http://ruleqa.spamassassin.org,
including the fact that
86% of URIBL_JP_SURBL hits also hit URIBL_OB_SURBL
66% of URIBL_JP_SURBL hits also hit URIBL_WS_SURBL
56% of URIBL_JP_SURBL hits also hit URIBL_AB_SURBL


The default scores are assigned by analysis of the combinations of rule 
hits. See http://wiki.apache.org/spamassassin/HowScoresAreAssigned and the 
discussion of the Perceptron.


Assigning any given single rule a poison-pill score (or nearly so) 
is generally a bad idea.
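
A small worked example shows why overlapping lists make high per-rule scores risky: the scores of all rules that fire are added together. The per-rule values in this Python sketch are hypothetical, chosen from within the 1.2 to 1.9 range quoted above, and the function name is illustrative.

```python
def total_score(hits, scores):
    """Sum the scores of the rules that fired on one message."""
    return sum(scores[name] for name in hits)


# Hypothetical per-rule scores within the quoted 1.2 to 1.9 range.
scores = {"URIBL_JP_SURBL": 1.5, "URIBL_OB_SURBL": 1.5, "URIBL_WS_SURBL": 1.5}
hits = ["URIBL_JP_SURBL", "URIBL_OB_SURBL", "URIBL_WS_SURBL"]

print(total_score(hits, scores))  # 4.5
```

With these numbers, three correlated list hits already bring a message close to the 5.0 threshold; scoring each near 5.0 would let one listed URI decide the outcome on its own.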


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  An entitlement beneficiary is a person or special interest group
  who didn't earn your money, but demands the right to take your
  money because they *want* it.-- John McKay, _The Welfare State:
   No Mercy for the Middle Class_
---
 62 days since Obama's inauguration and still no unicorn!


Re: ruleset

2009-03-24 Thread Matt Kettler
JC Putter wrote:
 where can i find more rulesets? using openprotect sare rules and
 sought rulesets


That's about all there are... A few folks have odds and ends rules
posted on their webpages/blogs/etc, but they're of mixed quality.

Is there a particular reason you're looking for more rulesets?




Re: Using SpamAssassin for just the Bayesian filtering?

2009-03-24 Thread Benny Pedersen

On Tue, March 24, 2009 01:54, mouss wrote:
 if you want to fight spam, ask open questions. SA is a good filter.
 Bayes isn't as perfect as you might think.

reminds me of 3660 secs for one hour :)

-- 
http://localhost/ 100% uptime and 100% mirrored :)



Re: What is AWL: _Average-Whitelister_....

2009-03-24 Thread Linda Walsh



John Hardin wrote:

What is AWL rule? Why it gives so different amount of points?


Auto Whitelist is a misleading name. It is actually a score averager. 
Since the points it applies are based on the historical scoring from 
that sender, the score will vary by who the sender is and when the 
message is processed (i.e. their history to-date).

---

Thank you for the clear and simple explanation.

Perhaps:
AWL (AutomaticWhiteList)
should be renamed to:
AWL (Averaging-Whitelister)
NOT JOKING, despite humorous acronym constant

While the acronym would/could stay the same, the standard
expanded form should say it is an Averaging - something
(whitelister, blacklister, whatever).

The important point is not that it's automatically applied, but
that it's _A_veraging...
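
The averaging behaviour can be sketched in a few lines of Python. This is a simplified model, not SpamAssassin's actual code: the function name is hypothetical, and the factor of 0.5 is an assumption about how strongly the score is pulled toward the sender's historical mean.

```python
def awl_adjusted_score(current_score, past_scores, factor=0.5):
    """Pull the current score toward the sender's historical mean."""
    if not past_scores:
        return current_score  # no history yet, nothing to average against
    mean = sum(past_scores) / len(past_scores)
    return current_score + (mean - current_score) * factor


# A sender with a clean history sends one spammy-looking message:
print(awl_adjusted_score(10.0, [0.0, 2.0, 1.0]))  # 5.5
# A sender with a spammy history sends one clean-looking message:
print(awl_adjusted_score(-1.0, [8.0, 8.0]))       # 3.5
```

The same mechanism lowers a score for one sender and raises it for another, which is why the AWL points vary so much between messages.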

This clarifies this long outstanding Q for me as well.

Thanks!
Linda


Re: Spam Assassin White List

2009-03-24 Thread Benny Pedersen

On Tue, March 24, 2009 03:34, dsh979 wrote:

 blacklist_from *...@blacklist1.com
 blacklist_from *...@blacklist2.com
 blacklist_from *...@blacklist3.com
 required_score 100
 whitelist_from *...@whitelist1.com
 whitelist_from *...@whitelist2.com
 whitelist_from *...@whitelist3.com

forged senders welcome :)

hope *_from will be removed in the next SA; it's the worst check of
all the tests in current SA :/

change to whitelist_auth rules




Spam from windows live

2009-03-24 Thread jcputter
I am receiving spam all the time from Windows Live accounts, and SpamAssassin 
doesn't register a single hit. I am using the sought rules with OpenProtect's 
SARE rules, along with DCC, Pyzor, Razor2 and iXhash.

I created a rule to stop spam containing Windows Live Spaces, but spam like 
this one doesn't get a single hit.

Here is the raw header of one such mail:

Return-Path: ethelindkjbhjydkh...@live.com
X-Original-To: jcput...@centreweb.co.za
Delivered-To: jcput...@centreweb.co.za
Received: from mail.centreweb.co.za (localhost [127.0.0.1])
by office.numata.local (Postfix) with ESMTP id 516E24BDB4
for jcput...@centreweb.co.za; Tue, 24 Mar 2009 19:43:29 +0200 (SAST)
X-Original-To: jcput...@centreweb.co.za
Received: from bay0-omc1-s25.bay0.hotmail.com (bay0-omc1-s25.bay0.hotmail.com 
[65.54.246.97])
by mail.centreweb.co.za (Postfix) with ESMTP id ACDD1160796
for jcput...@centreweb.co.za; Tue, 24 Mar 2009 23:31:34 +0200 (SAST)
Received: from BAY102-W23 ([64.4.61.123]) by bay0-omc1-s25.bay0.hotmail.com 
with Microsoft SMTPSVC(6.0.3790.3959);
 Tue, 24 Mar 2009 14:31:37 -0700
Message-ID: bay102-w23b420165a9a8e15cefadda9...@phx.gbl
Content-Type: multipart/alternative;
boundary=_6a0f2882-1775-43b5-9655-4147fe68795d_
X-Originating-IP: [92.48.45.254]
From: drake ethelind ethelindkjbhjydkh...@live.com
To: appe...@gmail.com, jcput...@centreweb.co.za
Subject: Hot teen deep f: uc-king giant dog c:o ck
Date: Tue, 24 Mar 2009 21:31:38 +
Importance: Normal
MIME-Version: 1.0
X-OriginalArrivalTime: 24 Mar 2009 21:31:37.0912 (UTC) 
FILETIME=[E9E1AB80:01C9ACC7]
X-numata_local-MailScanner-ID: 516E24BDB4.877C7
X-numata_local-MailScanner: Found to be clean
X-numata_local-MailScanner-From: ethelindkjbhjydkh...@live.com
X-Spam-Status: No




Re: Spam Assassin White List

2009-03-24 Thread RW
On Wed, 25 Mar 2009 01:35:53 +0100 (CET)
Benny Pedersen m...@junc.org wrote:

 
 On Tue, March 24, 2009 03:34, dsh979 wrote:
 

  whitelist_from *...@whitelist3.com
 
 forged senders welcome :)
 
 hope *_from will be removed in the next SA; it's the worst check of
 all the tests in current SA :/
 
 change to whitelist_auth rules

I think they all have their place. Clearly you'd want to use
whitelist_auth for the likes of paypal, but whitelist_from is almost
certainly a better choice when there is a low probability of forgery,
e.g. one obscure company whitelisting another.


Re: Spam from windows live

2009-03-24 Thread Benny Pedersen

On Wed, March 25, 2009 01:59, jcput...@centreweb.co.za wrote:
 I am receiving spam all the time from Windows Live accounts, and
 SpamAssassin doesn't register a single hit. I am using the sought
 rules with OpenProtect's SARE rules, along with DCC, Pyzor, Razor2
 and iXhash.

 I created a rule to stop spam containing Windows Live Spaces, but
 spam like this one doesn't get a single hit.

 Here is the raw header of one such mail:

where is the raw body ?


 Return-Path: ethelindkjbhjydkh...@live.com
 X-Original-To: jcput...@centreweb.co.za
 Delivered-To: jcput...@centreweb.co.za
 Received: from mail.centreweb.co.za (localhost [127.0.0.1])
   by office.numata.local (Postfix) with ESMTP id 516E24BDB4
   for jcput...@centreweb.co.za; Tue, 24 Mar 2009 19:43:29 +0200
 (SAST)
 X-Original-To: jcput...@centreweb.co.za
 Received: from bay0-omc1-s25.bay0.hotmail.com
 (bay0-omc1-s25.bay0.hotmail.com [65.54.246.97])
   by mail.centreweb.co.za (Postfix) with ESMTP id ACDD1160796
   for jcput...@centreweb.co.za; Tue, 24 Mar 2009 23:31:34 +0200
 (SAST)
 Received: from BAY102-W23 ([64.4.61.123]) by
 bay0-omc1-s25.bay0.hotmail.com with Microsoft
 SMTPSVC(6.0.3790.3959);
 Tue, 24 Mar 2009 14:31:37 -0700
 Message-ID: bay102-w23b420165a9a8e15cefadda9...@phx.gbl
 Content-Type: multipart/alternative;
   boundary=_6a0f2882-1775-43b5-9655-4147fe68795d_
 X-Originating-IP: [92.48.45.254]
 From: drake ethelind ethelindkjbhjydkh...@live.com
 To: appe...@gmail.com, jcput...@centreweb.co.za
 Subject: Hot teen deep f: uc-king giant dog c:o ck
 Date: Tue, 24 Mar 2009 21:31:38 +
 Importance: Normal
 MIME-Version: 1.0
 X-OriginalArrivalTime: 24 Mar 2009 21:31:37.0912 (UTC)
 FILETIME=[E9E1AB80:01C9ACC7]
 X-numata_local-MailScanner-ID: 516E24BDB4.877C7
 X-numata_local-MailScanner: Found to be clean
 X-numata_local-MailScanner-From: ethelindkjbhjydkh...@live.com
 X-Spam-Status: No

From the above I would reject senders from live.com, or check SPF.

if spf pass, then reject the friend sender :)




New kind of spam

2009-03-24 Thread Jack Raats

Today I received two messages containing a new(?) kind of spam.
The messages, both HTML, spelled out the word viagra by colouring cells 
in a table.
The messages also contained a link to a blog (live.com). The rest of each 
message was text meant to mislead Bayes filtering.


How can these messages be stopped? By disallowing HTML messages?

Jack