Re: Questions about SA

2009-10-05 Thread Igor Bogomazov
 On Fri, 2 Oct 2009, Jose Luis Marin Perez wrote:
 
  - Approximately 85% of the spam is in Spanish; can this be a problem
  for SpamAssassin?
 
 Possibly. Most of the default rules and most third-party rules are
 for English. This would tend to reduce your hit rate, but a
 properly-trained Bayes would help correct that.
 
 I don't know if anybody is generating third-party rules for 
 spanish-language spam...
 

I use SA for filtering Russian spam, and I have no complaints.

-- 
Best regards,

Igor Bogomazov
Chief Technical Specialist
HighLink Ltd. St-Petersburg, Russia
8(812)334-12-12 [ext. 220]
8(963)344-44-38 (Beeline)
http://www.hl.ru





Re: Questions about SA

2009-10-03 Thread Matus UHLAR - fantomas
On 02.10.09 09:32, Jose Luis Marin Perez wrote:
  - How to calculate the amount of memory and CPU used by each process Spamd? 

That is very hard to estimate without actually running multiple spamd
processes. It also depends on the sizes of the mails checked.
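
One rough way to get numbers is to sample the running spamd processes with
ps while real mail is flowing. The following is a minimal Python sketch, not
a SpamAssassin tool. It assumes a ps that accepts "-eo pid,rss,pcpu,args"
(procps on Linux does) and that the processes can be recognized by "spamd"
appearing in their command line. The function names parse_ps and
sample_spamd are made up for this illustration.

```python
import subprocess


def parse_ps(output, name="spamd"):
    """Parse `ps -eo pid,rss,pcpu,args` text into [(pid, rss_kib, cpu_pct)].

    Only rows whose command line contains `name` are kept.
    """
    rows = []
    for line in output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 3)      # pid, rss, pcpu, rest of command line
        if len(fields) < 4 or name not in fields[3]:
            continue
        rows.append((int(fields[0]), int(fields[1]), float(fields[2])))
    return rows


def sample_spamd():
    """Run ps once and return the parsed rows for spamd processes."""
    out = subprocess.run(
        ["ps", "-eo", "pid,rss,pcpu,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ps(out)


def summarize(rows):
    """Return (process count, average RSS in MiB, highest CPU percentage)."""
    if not rows:
        return 0, 0.0, 0.0
    avg_rss_mib = sum(rss for _, rss, _ in rows) / len(rows) / 1024
    max_cpu = max(cpu for _, _, cpu in rows)
    return len(rows), avg_rss_mib, max_cpu
```

Calling summarize(sample_spamd()) several times during a busy period gives a
feel for the per-process footprint. Two caveats: RSS counts pages shared
between the spamd children, so summing it overstates total memory, and the
pcpu column is usually an average over the process lifetime rather than an
instantaneous reading.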
-- 
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
LSD will make your ECS screen display 16.7 million colors


Questions about SA

2009-10-02 Thread Jose Luis Marin Perez

I have some questions: 

 - How can I calculate the amount of memory and CPU used by each spamd process? 
 - Approximately 85% of the spam is in Spanish; can this be a problem for 
SpamAssassin? 
 - Which tool can I use to get SpamAssassin statistics? I am currently using 
the script sa-stats.pl.

Thanks

Jose Luis 

Re: Questions about SA

2009-10-02 Thread John Hardin

On Fri, 2 Oct 2009, Jose Luis Marin Perez wrote:


- Approximately 85% of the spam is in Spanish; can this be a problem for
  SpamAssassin?


Possibly. Most of the default rules and most third-party rules are for 
English. This would tend to reduce your hit rate, but a properly-trained 
Bayes would help correct that.
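
Training Bayes on a local corpus is done with sa-learn, which accepts --spam
and --ham followed by a file or directory. The following is a minimal Python
sketch of a wrapper around it. The directory paths in the usage note and the
function names are hypothetical, and it assumes sa-learn is on the PATH and
is run as the user whose Bayes database should be trained.

```python
import subprocess


def sa_learn_command(kind, path):
    """Build the sa-learn argument list for one corpus; kind is 'spam' or 'ham'."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    return ["sa-learn", f"--{kind}", path]


def train(spam_dir, ham_dir):
    """Feed a spam corpus and a ham corpus to sa-learn, one after the other."""
    for kind, path in (("spam", spam_dir), ("ham", ham_dir)):
        # check=True raises CalledProcessError if sa-learn exits non-zero.
        subprocess.run(sa_learn_command(kind, path), check=True)
```

For example, train("/var/spool/corpus/spam", "/var/spool/corpus/ham") would
learn both directories. With the default settings Bayes does not start
scoring until it has learned at least 200 spam and 200 ham messages, so both
corpora matter, and the ham should include legitimate Spanish-language mail.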


I don't know if anybody is generating third-party rules for 
spanish-language spam...



- Which tool can I use to get statistics of SpamAssassin, I am currently
  using the script sa-stats.pl.


sa-stats.pl is a good tool to get your local rule performance.
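
For a quick look at the same kind of data, the spamd "result:" log lines can
be tallied directly. The following is a minimal Python sketch, not a
replacement for sa-stats.pl. It assumes syslog lines of the form
"spamd: result: Y 15 - RULE_A,RULE_B scantime=...", where "Y" marks spam and
"." marks ham. Check a few lines of your own log before relying on the
pattern; the function name tally is made up for this illustration.

```python
import re
from collections import Counter

# Assumed spamd log format: "spamd: result: Y 15 - RULE_A,RULE_B scantime=..."
RESULT_RE = re.compile(
    r"spamd: result: (?P<flag>[Y.])\s+(?P<score>-?\d+(?:\.\d+)?)\s+-\s+"
    r"(?P<rules>\S*?)\s*scantime="
)


def tally(lines):
    """Count spam/ham verdicts and per-rule hits from spamd log lines.

    Returns (totals, rule_hits): totals maps 'spam'/'ham' to message counts,
    rule_hits maps (kind, rule_name) to the number of messages that hit it.
    """
    totals = Counter()
    rule_hits = Counter()
    for line in lines:
        match = RESULT_RE.search(line)
        if not match:
            continue  # not a result line
        kind = "spam" if match.group("flag") == "Y" else "ham"
        totals[kind] += 1
        for rule in filter(None, match.group("rules").split(",")):
            rule_hits[(kind, rule)] += 1
    return totals, rule_hits
```

Passing an open log file to tally() and printing rule_hits.most_common(20)
shows which rules fire most often on spam and on ham. A rule that hits a lot
of ham is a candidate for a lower score.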

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Gun Control laws cannot reduce violent crime, because gun control
  laws assume a violent criminal will obey the law.
---
 Approximately 9085920 firearms legally purchased in the U.S. this year


Questions about sa-learn and report_safe encapsulation

2005-09-14 Thread Nels Lindquist
Hi there.

I'm trying to set up an IMAP based bayesian learner using the 
instructions in the SA wiki for RemoteIMAPFolder, etc.

I'm diverting messages to the IMAP mailstore from MIMEDefang, and I'm 
trying to set up MIMEDefang to replicate SA's report_safe 
encapsulation format so that sa-learn only learns the encapsulated 
message while ignoring the included SA report, etc.

I appear to have done something wrong, however.  Following the 
instructions in the wiki, I have fetchmail snagging messages from the 
appropriate IMAP folder and feeding them to sa-learn, but sa-learn 
doesn't appear to be properly detecting the message encapsulation.

As far as I can tell from looking at the code, sa-learn does a check 
for the existence of the X-Spam-Checker-Version header to decide 
whether or not to call remove_spamassassin_markup().  Within that 
subroutine it checks for a Content-Type header matching a regexp 
which includes multipart/mixed; and some other things I don't quite 
follow. :-)

As far as I can tell, though, the messages aren't being detected as 
encapsulated--I'm using the -D flag with sa-learn and "Removing 
Markup" never shows up in the dbg messages I expect from the code in 
remove_spamassassin_markup(), and the debug messages show URLs being 
parsed which are only present in the spamassassin report included in 
the body text, but not in the encapsulated message itself.
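
To check whether a generated message has the shape described above, the two
conditions can be tested outside sa-learn. The following is a minimal Python
sketch using the standard email package. It mirrors only the
X-Spam-Checker-Version check and the multipart/mixed check, not
SpamAssassin's actual regexp, and it assumes the report_safe 1 layout, in
which the original message is attached as a message/rfc822 part. The
function names are made up for this illustration.

```python
from email import message_from_bytes, policy


def looks_encapsulated(msg):
    """True if the message has the two markers described in the post above."""
    return (
        msg["X-Spam-Checker-Version"] is not None
        and msg.get_content_type() == "multipart/mixed"
    )


def extract_original(raw):
    """Return the attached original message as bytes, or None if not found.

    Assumes the report_safe 1 layout: a multipart/mixed wrapper whose
    message/rfc822 part holds the original mail.
    """
    msg = message_from_bytes(raw, policy=policy.default)
    if not looks_encapsulated(msg):
        return None
    for part in msg.iter_parts():
        if part.get_content_type() == "message/rfc822":
            # A message/rfc822 part carries exactly one nested message.
            return part.get_payload(0).as_bytes()
    return None
```

Running extract_original() on a message saved from the IMAP folder shows
quickly whether the wrapper headers and the attachment type are what a
parser would expect. A return value of None means one of the markers is
missing.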

Is there some other trick that I'm missing while generating a message 
that sa-learn will recognize as report_safe encapsulated?

Thanks!

Working with SA 3.10rc1, by the way.


Nels Lindquist *
Information Systems Manager
Morningstar Air Express Inc.


