Re: Which is more efficient: two regexp's or one regexp with alternation?

2007-01-17 Thread Justin Mason

Theo Van Dinter writes:
 On Tue, Jan 16, 2007 at 06:23:48PM -0700, Kelly Jones wrote:
  If I want to block subjects matching foo or bar, is it more
  efficient to write two regexps or a single foo|bar regexp?
 
 It'll be more efficient to do a single regexp.
 
  Or are there multiple rules because the scores are different and have
  to be optimized differently?
 
 Yes.  Rules have different hit rates, so they need to get different
 scores because it's better for overall efficacy.

Also, a single regexp matches only once.  We want to know not only
that *one* of our rules matched, but also *which* rules, and
in what combination.
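
A small Python sketch of that point (not SpamAssassin code; the rule names and patterns below are made up for illustration). With separate patterns you can report every rule that hit, whereas a combined alternation gives one yes/no answer:

```python
import re

# Hypothetical rules, one pattern per rule name.
RULES = {
    "SUBJ_FOO": re.compile(r"foo", re.IGNORECASE),
    "SUBJ_BAR": re.compile(r"bar", re.IGNORECASE),
}

# The same two patterns folded into one alternation.
COMBINED = re.compile(r"foo|bar", re.IGNORECASE)


def rules_hit(subject):
    """Return the names of all rules whose pattern matches the subject."""
    return sorted(name for name, rx in RULES.items() if rx.search(subject))


def combined_hit(subject):
    """Return True if the single alternation matches anywhere in the subject."""
    return COMBINED.search(subject) is not None


if __name__ == "__main__":
    subject = "Cheap foo and bar"
    print(rules_hit(subject))     # names of every rule that matched
    print(combined_hit(subject))  # a single yes/no answer
```

The per-rule result is what lets each rule carry its own score and be combined with others.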

--j.


AWL question

2007-01-17 Thread Rocco Scappatura
Hello,

I use SA with its data stored in MySQL databases.

I have seen that the awl table contains email addresses with the value
'none' in the 'IP' field.

Why is this field not filled in correctly for some entries?

Thanks,

rocsca


Expiring tokens in SA database

2007-01-17 Thread Rocco Scappatura
Hello,

I'm using SA with MySQL.

I have two Amavisd-new servers, each talking to a different MySQL
server.

Every night I run this command:

sa-learn --sync --force-expire

for database maintenance.

I have noticed that on the first server the 'bayes_token' table always
occupies about 1 GB, and its size never decreases even after I run the
command above (see the output in the attached file), while on the second
database the same table occupies less space (about 250 MB).

It seems to me that expiry doesn't work at all, and I can't figure
out why.

Can somebody explain this?

TIA,

rocsca


sa-learn.out
Description: sa-learn.out


Minor FP on ham with logo - 5.173

2007-01-17 Thread Kevin Golding
I've had a few FPs on a legitimate mail from someone who apparently
enjoys large fonts and a logo.  I'm using network tests but not bayes
and these are the stock rules being hit along with their scores from
sa-update:

EXTRA_MPART_TYPE     0.815
DK_POLICY_SIGNSOME   0.001
TVD_FW_GRAPHIC_ID1   2.100
HTML_MESSAGE         0.001
HTML_FONT_BIG        0.256
PART_CID_STOCK       1.000
PART_CID_STOCK_LESS  1.000

Total:               5.173

In stock install it's just over the default threshold of 5 so if someone
wants samples for a ham corpus I can provide a couple.
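
For anyone checking the arithmetic, here is a minimal Python sketch that sums the scores listed above and compares the total with the default threshold. The function names are mine, and the >= comparison is my assumption about how the threshold is applied:

```python
# Rule scores copied from the report above.
HITS = {
    "EXTRA_MPART_TYPE": 0.815,
    "DK_POLICY_SIGNSOME": 0.001,
    "TVD_FW_GRAPHIC_ID1": 2.100,
    "HTML_MESSAGE": 0.001,
    "HTML_FONT_BIG": 0.256,
    "PART_CID_STOCK": 1.000,
    "PART_CID_STOCK_LESS": 1.000,
}

THRESHOLD = 5.0  # the default required score mentioned above


def total_score(hits):
    """Sum the per-rule scores, rounded to three decimals to hide float noise."""
    return round(sum(hits.values()), 3)


def is_spam(hits, threshold=THRESHOLD):
    """Assumption: a message is tagged when its total reaches the threshold."""
    return total_score(hits) >= threshold


def largest_hit(hits):
    """Name of the rule that contributes the most to the total."""
    return max(hits, key=hits.get)


if __name__ == "__main__":
    print(total_score(HITS), is_spam(HITS), largest_hit(HITS))
```

Dropping the single largest rule from the dictionary is enough to bring this message back under 5.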

Fwiw their mailer is Microsoft Outlook Express 6.00.2900.3028

Kevin


URIBL

2007-01-17 Thread Jon Bjorn Njalsson
Is it possible to have SA find a URL in a mail, look up the IP address
for the URL, check whether that IP address is listed in some RBL zone,
and score accordingly?

For example, I receive a lot of spam containing URLs like
http://www.thesillyguy.info or thenopers.info, and these sites all
resolve to the same IP address, 216.40.47.17. Instead of writing rules
based on these sites, is it possible to write a rule based on the
IP address?





Re: Expiring tokens in SA database

2007-01-17 Thread Nigel Frankcom

Do you compact the database afterwards?

Nigel


RE: URIBL

2007-01-17 Thread Martin.Hepworth
Jon

Yes this functionality has been built in since SA version 3.0 (and via
an additional 'plugin' since 2.6.?4?).

Make sure you are using network tests, Net::DNS perl module is installed
and the URI-RBL plugin is enabled in the *.pre files which are located
in the same place as local.cf (normally /etc/mail/spamassassin).
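
To illustrate what an IP-based blocklist lookup involves, here is a rough Python sketch. It is not the plugin's code and not a SpamAssassin configuration; the zone name is a placeholder. By the usual DNSBL convention the IPv4 octets are reversed and the zone is appended, and the resulting name is then queried in DNS:

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone.

    The octets are reversed and the zone is appended, so 1.2.3.4 in
    zone "bl.example" becomes 4.3.2.1.bl.example.
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone.strip('.')}"


if __name__ == "__main__":
    # The address from the question; "dnsbl.example.org" is a placeholder zone.
    print(dnsbl_query_name("216.40.47.17", "dnsbl.example.org"))
```

No DNS query is made here; the sketch only builds the name that a resolver would be asked for.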


--
Martin Hepworth
Snr Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300

**
Confidentiality : This e-mail and any attachments are intended for the 
addressee only and may be confidential. If they come to you in error 
you must take no action based on them, nor must you copy or show them 
to anyone. Please advise the sender by replying to this e-mail 
immediately and then delete the original from your computer.

Opinion : Any opinions expressed in this e-mail are entirely those of 
the author and unless specifically stated to the contrary, are not 
necessarily those of the author's employer.

Security Warning : Internet e-mail is not necessarily a secure 
communications medium and can be subject to data corruption. We advise 
that you consider this fact when e-mailing us. 

Viruses : We have taken steps to ensure that this e-mail and any 
attachments are free from known viruses but in keeping with good 
computing practice, you should ensure that they are virus free.

Red Lion 49 Ltd T/A Solid State Logic
Registered as a limited company in England and Wales 
(Company No:5362730)
Registered Office: 25 Spring Hill Road, Begbroke, Oxford OX5 1RU, 
United Kingdom
**



RE: Expiring tokens in SA database

2007-01-17 Thread Rocco Scappatura
 Do you compact the database afterwards?
 
 Nigel

No. How do I do that?

rocsca


Re: Expiring tokens in SA database

2007-01-17 Thread Nigel Frankcom
On Wed, 17 Jan 2007 13:00:37 +0100, Rocco Scappatura
[EMAIL PROTECTED] wrote:

 Do you compact the database afterwards?
 
 Nigel

No. How do I do that?

rocsca

From the command line, use something like this:

mysql -u root --password=yourpassword -e "USE spamassassin; OPTIMIZE TABLE awl, bayes_expire, bayes_seen, bayes_token, bayes_vars;"

Your tables may differ slightly from mine, and some may have no
content at all; initially try compacting the one that's biggest.
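
If you would rather script this than type it each night, a minimal Python sketch follows. The function name and defaults are my own; it only builds the argument list for the mysql client, so the SQL statement never has to survive shell quoting:

```python
import shlex


def optimize_command(database, tables, user="root"):
    """Build the argv list for a mysql client call that runs OPTIMIZE TABLE.

    -p with no value makes the client prompt for the password instead of
    putting it on the command line. Names are inserted as-is, so only
    pass trusted database and table names.
    """
    if not tables:
        raise ValueError("need at least one table name")
    sql = f"USE {database}; OPTIMIZE TABLE {', '.join(tables)};"
    return ["mysql", "-u", user, "-p", "-e", sql]


if __name__ == "__main__":
    argv = optimize_command("spamassassin", ["bayes_token"])
    # shlex.join shows the equivalent, correctly quoted shell command line.
    print(shlex.join(argv))
    # To actually run it: subprocess.run(argv, check=True)
```

Starting with just the biggest table, as suggested above, keeps the first run short.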

KR

Nigel


Re: restart sequence after changing configuration

2007-01-17 Thread Mike Kenny

Thanks guys. This is helpful.

On 1/16/07, Robert Brooks [EMAIL PROTECTED] wrote:


Mike Kenny wrote:
 As I understand the situation when using amavis/spamassassin/postfix the
 flow of a messages is that it is received by postfix, passed to amavisd,
 from there to clamav, then to spamassassin and then back to postfix. My
 questions are:

 1. is this correct?
 2. if I change spamassassin's local.cf or user_prefs,
 what needs to be restarted and in what sequence?
 3. if I change amavisd.conf what needs restarting/reloading, etc and in
 what sequence?

amavisd-new loads the spamassassin perl modules itself, if you are
running amavisd-new you don't need to run spamd.

Restarting amavisd-new will cause local.cf to be re-read, however much
of user_prefs is not used by amavis, so you should look at the settings
in amavisd.conf first.

You can restart amavisd-new without restarting postfix.


--
Robert Brooks,   Network Manager,  Cable & Wireless UK
[EMAIL PROTECTED]   http://wtg.cw.com/
Tel: +44 (0)20 7339 8600  Fax: +44 (0)20 7339 8601
-  What was your username again? - BOFH-



Re: AWL question

2007-01-17 Thread Magnus Holmgren
On Wednesday 17 January 2007 11:24, Rocco Scappatura wrote:
 I use SA storing data on MySQL databases.

 I have seen that the awl table contains email addresses with the value
 'none' in the 'IP' field.

 Why is this field not filled in correctly for some entries?

Perhaps it could be that mail was submitted locally (not with SMTP), over IPv6 
or that the IP address couldn't be extracted for some other reason.

-- 
Magnus Holmgren        [EMAIL PROTECTED]
   (No Cc of list mail needed, thanks)

  Exim is better at being younger, whereas sendmail is better for 
   Scrabble (50 point bonus for clearing your rack) -- Dave Evans




RE: AWL question

2007-01-17 Thread Rocco Scappatura
Thanks for your answer,

  I have seen that the awl table contains email addresses with the value
  'none' in the 'IP' field.
 
  Why is this field not filled in correctly for some entries?
 
 Perhaps it could be that mail was submitted locally (not with 
 SMTP), over IPv6 or that the IP address couldn't be extracted 
 for some other reason.

No, the email is not submitted locally, and it comes in over TCP. So I think
it is the second reason you mentioned. But why could the IP not be
extracted? (I have many such cases!)

BR,

rocsca


RE: Expiring tokens in SA database

2007-01-17 Thread Rocco Scappatura
Hello,

  Do you compact the database afterwards?
  
  Nigel
 
 No. How do I do that?
 
 rocsca
 
 From the command line, use something like this:
 
 mysql -u root --password=yourpassword -e "USE spamassassin; OPTIMIZE TABLE awl, bayes_expire, bayes_seen, bayes_token, bayes_vars;"
 
 Your tables may differ slightly from mine, and some may have 
 no content at all; initially try compacting the one that's biggest.
 

In fact, that was the problem!

Many thanks,

rocsca


Blacklisting efficiently using first and final rules?

2007-01-17 Thread Kelly Jones

Blacklisting with SpamAssassin is easy: just add a rule with a high score.

However, this seems inefficient, since SpamAssassin will still go
through its entire ruleset to calculate a score.

Is it possible to set up first and final rules in SpamAssassin? That
is, rules that are: 1) checked before any other rules, and 2) if any
of the rules are hit, no other rules are checked.

Given that some tests (especially network-based tests?) are expensive
and that most people get a lot of spam, I think this would help a lot,
but I could be wrong: any thoughts on whether this would really
increase efficiency?

Of course, first and final rules could be used for whitelisting, too.

--
We're just a Bunch Of Regular Guys, a collective group that's trying
to understand and assimilate technology. We feel that resistance to
new ideas and technology is unwise and ultimately futile.


PLease help

2007-01-17 Thread MIS
Hi,

I am a newbie to this list and quite new to the difficulties of fighting spam.
Please accept my apologies if sending a copy of a spam message to the list is
not acceptable etiquette.

We have started to receive quite a lot of spam of this form, with an
embedded stock or meds image. None of my rules are hitting it. Can anyone
please offer me some help so that I can get to the bottom of what is becoming a
very time-consuming process? I use SpamAssassin 2.64. Perhaps the answer is
that it is time to upgrade.

Thanks

Bob

Example...

which is headed, than as that of an animal, for the animal does speak of an 
action or a process as lengthy, because the time covered qualities. It is 
evident that these are qualities, for those things A line, on the other hand, 
is a continuous quantity, for it is

genteel and insinuating: he waved his hands plausibly as he went, and of which 
it is a half. Similarly the existence of a master to withstand disintegration; 
softness, again, is predicated of a thing to be correlative with another, and 
the terminology used is correct,
particular attitudes, but attitude is itself a relative term. To colour; 
justice and injustice, to contrary genera, virtue and vice; Thus habit differs 
from disposition in this, that while the latter that may be, it is an 
incontrovertible fact that the things which in
more of the same

Re: Blacklisting efficiently using first and final rules?

2007-01-17 Thread Theo Van Dinter
On Wed, Jan 17, 2007 at 07:25:24AM -0700, Kelly Jones wrote:
 Is it possible to set up first and final rules in SpamAssassin? That
 is, rules that are: 1) checked before any other rules, and 2) if any
 of the rules are hit, no other rules are checked.

1) Absolutely, see priority in the docs.
2) Not with the currently released code.  3.2 will have short-circuit
   functionality to do this.

-- 
Randomly Selected Tagline:
He has his own unique version of the English language...




Re: Blacklisting efficiently using first and final rules?

2007-01-17 Thread Justin Mason

Yes, this is in 3.2.0.



Re: Blacklisting efficiently using first and final rules?

2007-01-17 Thread Theo Van Dinter
On Wed, Jan 17, 2007 at 09:34:10AM -0500, Theo Van Dinter wrote:
  Is it possible to set up first and final rules in SpamAssassin? That
  is, rules that are: 1) checked before any other rules, and 2) if any
  of the rules are hit, no other rules are checked.
 
 1) Absolutely, see priority in the docs.
 2) Not with the currently released code.  3.2 will have short-circuit
    functionality to do this.

3) the best way to accomplish this (even w/ 3.2) is to not send mails that
   you want to whitelist/blacklist into SpamAssassin at all.  do the decision
   outside of SA and don't call it for mails that meet your criteria.
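
As a rough illustration of point 3, here is a Python sketch of a pre-filter that decides before any content scanning happens. The addresses and list names are invented; in practice the decision would live in your MTA or delivery agent:

```python
# Hypothetical local lists; in practice these would come from your MTA config.
WHITELIST = {"boss@example.com"}
BLACKLIST = {"spammer@example.net"}


def route(sender):
    """Decide what to do with a message before any content scanning.

    Returns "accept" or "reject" for listed senders, and "scan" for
    everyone else, so only unlisted mail is handed to the spam filter.
    """
    sender = sender.strip().lower()
    if sender in WHITELIST:
        return "accept"
    if sender in BLACKLIST:
        return "reject"
    return "scan"


if __name__ == "__main__":
    for address in ("boss@example.com", "spammer@example.net", "someone@example.org"):
        print(address, route(address))
```

Only the "scan" case costs a full rule run, which is the saving being discussed.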

-- 
Randomly Selected Tagline:
For Sale: Dehydrated H2O - $14 per quart




Re: PLease help

2007-01-17 Thread uxbod
Upgrade and use FuzzyOCR (http://fuzzyocr.own-hero.net/)


-- 
--[ UxBoD ]--
// PGP Key: curl -s http://www.splatnix.net/uxbod.asc | gpg --import
// Fingerprint: 543A E778 7F2D 98F1 3E50 9C1F F190 93E0 E8E8 0CF8
// Keyserver: www.keyserver.net Key-ID: 0xE8E80CF8


-- 
This message has been scanned for viruses and dangerous content by MailScanner,
and is believed to be clean.



RE: URIBL

2007-01-17 Thread Jon Bjorn Njalsson
I have the Net::DNS module installed.


[14934] dbg: dns: is Net::DNS::Resolver available? yes
[14934] dbg: dns: Net::DNS version: 0.57

and

[14934] dbg: plugin: loading Mail::SpamAssassin::Plugin::URIDNSBL from @INC
[14934] dbg: plugin: registered Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0xa5822f0)

and in v310.pre I have

loadplugin Mail::SpamAssassin::Plugin::URIDNSBL

but some spam about penis enlargement is still getting through.

Any other ideas?



 



RE: URIBL

2007-01-17 Thread Martin.Hepworth
Jon

DCC, Pyzor (using a working server) and Razor are useful.

--
Martin Hepworth
Snr Systems Administrator
Solid State Logic
Tel: +44 (0)1865 842300

 








phoney habeas signature?

2007-01-17 Thread Michael Scheidell
if this is phoney habeas, I propose a signature to detect it (the web
link does not exist!, isn't it phoney?)
(ymmv)

Accreditor: Habeas
X-Habeas-SWE-1: winter into spring
X-Habeas-SWE-2: brightly anticipated
X-Habeas-SWE-3: like Habeas SWE (tm)
X-Habeas-SWE-4: Copyright 2002 Habeas (tm)
X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this
X-Habeas-SWE-6: email in exchange for a license for this Habeas
X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant
X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this
X-Habeas-SWE-9: mark in spam to http://www.habeas.com/report/.


Postfix, pcre: header_checks:
/^X-Habeas-SWE-9: mark in spam to http:\/\/www\.habeas\.com\/report\// REJECT Forged Habeas

SA,  (local rules? local.cf?)

header FORGED_HABEAS_RPT X-Habeas-SWE-9 =~ /http:\/\/www\.habeas\.com\/report\//
describe FORGED_HABEAS_RPT  Forged to look like Habeas, 'report' site doesn't exist
score FORGED_HABEAS_RPT 7

-- 
Michael Scheidell, CTO
SECNAP Network Security / www.secnap.com
[EMAIL PROTECTED]  / 1+561-999-5000, x 1131


- 
This email has been scanned and certified safe by SpammerTrap(tm) 
For Information please see http://www.spammertrap.com


Re: URIBL

2007-01-17 Thread Nigel Frankcom
On Wed, 17 Jan 2007 14:46:36 +, Jon Bjorn Njalsson
[EMAIL PROTECTED] wrote:

 I have the Net::DNS module installed.
 
 [14934] dbg: dns: is Net::DNS::Resolver available? yes
 [14934] dbg: dns: Net::DNS version: 0.57
 
 and
 
 [14934] dbg: plugin: loading Mail::SpamAssassin::Plugin::URIDNSBL from @INC
 [14934] dbg: plugin: registered Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0xa5822f0)
 
 and in v310.pre I have
 
 loadplugin Mail::SpamAssassin::Plugin::URIDNSBL
 
 but some spam about penis enlargement is still getting through.
 
 Any other ideas?




Net::DNS 0.59 is the current version. It upgrades cleanly through
CPAN or yum.

0.57 can have some odd effects.

Nigel


Re: This score makes no sense

2007-01-17 Thread Jason Faulkner



 X-Spam-Status: No, score=4.5 required=5.0 tests=BAYES_05,CLAMAV,
HTML_IMAGE_ONLY_16,HTML_MESSAGE,MIME_HTML_ONLY,REPLY_TO_EMPTY
autolearn=disabled version=3.1.7
  



 5.0 BAYES_99   BODY: Bayesian spam probability is 99 to 100%
[score: 1.]
  
Your header shows BAYES_05, which would be a negative scoring, while 
your second run shows BAYES_99.


That could account for some of the problem.
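
A quick way to compare two runs is to pull the test names out of the header. This is only a Python sketch that handles the header layout quoted above; the function names are mine:

```python
import re


def parse_spam_status(header_value):
    """Parse the value of an X-Spam-Status header into a small dict.

    Handles the layout seen above: a verdict, key=value fields, and a
    comma-separated tests= list that may be folded across lines.
    """
    # Unfold: collapse line breaks and runs of whitespace into single spaces.
    text = " ".join(header_value.split())
    # Remove the space that unfolding leaves after each comma.
    text = re.sub(r",\s+", ",", text)
    verdict, _, rest = text.partition(",")
    fields = dict(item.split("=", 1) for item in rest.split() if "=" in item)
    tests = [t for t in fields.get("tests", "").split(",") if t and t != "none"]
    return {
        "is_spam": verdict.strip().lower() == "yes",
        "score": float(fields["score"]),
        "required": float(fields["required"]),
        "tests": tests,
    }


def bayes_rule(tests):
    """Return the BAYES_nn rule among the tests, or None if Bayes did not fire."""
    for name in tests:
        if name.startswith("BAYES_"):
            return name
    return None
```

Running bayes_rule on the tests from each run makes a BAYES_05 versus BAYES_99 mismatch obvious at a glance.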


--
Jason Faulkner
Systems Manager
Broadwick Corporation
(919) 459-2509



Botnet 0.7 error in debug log

2007-01-17 Thread Ben Wylie

I am getting this error in my logs:
[3780] warn: Use of uninitialized value in string eq at 
F:\Perl\site\lib/Mail/SpamAssassin/Plugin/Botnet.pm line 564.


According to my botnet.pm file this line reads:
  if (($tests =~ /nordns/) && ($domain eq "")) {

is there any reason why this error should be given?

Thanks
Ben




Re: phoney habeas signature?

2007-01-17 Thread Matt Kettler
Michael Scheidell wrote:
 if this is phoney habeas, I propose a signature to detect it (the web
 link does not exist!, isn't it phoney?)
   
Well, that's technically a valid part of the Habeas SWE mark.

However, SWE is *DEAD*. Habeas does not support SWE  at ALL anymore.
They're now on a more Bonded-Sender like system.

 X-Habeas-SWE-9: mark in spam to http://www.habeas.com/report/.




Re: phoney habeas signature?

2007-01-17 Thread Michael Scheidell
Matt Kettler wrote:
 Michael Scheidell wrote:
   
 if this is phoney habeas, I propose a signature to detect it (the web
 link does not exist!, isn't it phoney?)
   
 
 Well, that's technically a valid part of the Habeas SWE mark.

 However, SWE is *DEAD*. Habeas does not support SWE  at ALL anymore.
 They're now on a more Bonded-Sender like system.

   
 X-Habeas-SWE-9: mark in spam to http://www.habeas.com/report/.
 


   
so, that signature denotes someone who doesn't know this? and as such is
a spam sign?


-- 
Michael Scheidell, CTO
SECNAP Network Security / www.secnap.com
[EMAIL PROTECTED]  / 1+561-999-5000, x 1131





Re: phoney habeas signature?

2007-01-17 Thread Theo Van Dinter
On Wed, Jan 17, 2007 at 10:36:25AM -0500, Michael Scheidell wrote:
  However, SWE is *DEAD*. Habeas does not support SWE  at ALL anymore.
  They're now on a more Bonded-Sender like system.
  X-Habeas-SWE-9: mark in spam to http://www.habeas.com/report/.

 so, that signature denotes someone who doesn't know this? and as such is
 a spam sign?

I guess it depends on how you define spam sign.  I have 0 spam hits with
this in it, but still some ham which includes it.  So IMO, it's definitely
not a spam sign, and if you've only received one or two spams that tried using
it, I'd ignore it as not worth dealing with.
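
The kind of check I mean can be sketched in a few lines of Python. The helper names are invented for illustration, and you would feed it your own ham and spam corpora as lists of raw message strings:

```python
import re

# The header text discussed in this thread.
SWE9 = re.compile(
    r"^X-Habeas-SWE-9: mark in spam to http://www\.habeas\.com/report/",
    re.MULTILINE,
)


def hit_counts(pattern, ham, spam):
    """Count how many messages in each corpus contain the pattern."""
    ham_hits = sum(1 for msg in ham if pattern.search(msg))
    spam_hits = sum(1 for msg in spam if pattern.search(msg))
    return ham_hits, spam_hits


def looks_like_spam_sign(ham_hits, spam_hits):
    """Crude check: a useful spam sign should hit spam more often than ham."""
    return spam_hits > 0 and spam_hits > ham_hits
```

With hits in ham and none in spam, the crude check above says the pattern is not worth a rule.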

-- 
Randomly Selected Tagline:
When experiment and theory conflict, experiment wins.   - Tim Smith




Re: This score makes no sense

2007-01-17 Thread jdow

From: Chris [EMAIL PROTECTED]

Betcha if you look at it closely it's a phish. ClamAV catches phishes
as well as viruses.

(Just reading the headers makes me think phish. Why would you receive a
TD Canada EasyWeb Online message from costapacific.net?)

{^_^}


Re: This score makes no sense

2007-01-17 Thread jdow

From: Jason Faulkner [EMAIL PROTECTED]




 X-Spam-Status: No, score=4.5 required=5.0 tests=BAYES_05,CLAMAV,
HTML_IMAGE_ONLY_16,HTML_MESSAGE,MIME_HTML_ONLY,REPLY_TO_EMPTY
autolearn=disabled version=3.1.7




 5.0 BAYES_99   BODY: Bayesian spam probability is 99 to 100%
[score: 1.]

Your header shows BAYES_05, which would be a negative scoring, while your 
second run shows BAYES_99.


That could account for some of the problem.


Note THIS line:
 10 CLAMAV Clam AntiVirus detected a virus


That is the standard score for the ClamAV plugin.
{^_^} 



Re: Botnet 0.7 error in debug log

2007-01-17 Thread John Rudd

Ben Wylie wrote:

I am getting this error in my logs:
[3780] warn: Use of uninitialized value in string eq at 
F:\Perl\site\lib/Mail/SpamAssassin/Plugin/Botnet.pm line 564.


According to my botnet.pm file this line reads:
  if (($tests =~ /nordns/) && ($domain eq "")) {

is there any reason why this error should be given?



I'll have a fix out, in the near future, that fixes this (sorry, been 
busy with some file server issues the last couple weeks).


Re[2]: getting Bayes token data from spamassassin

2007-01-17 Thread Fred T
Hello Stuart,

Monday, January 15, 2007, 4:54:07 AM, you wrote:

 I've searched around a bit, both on gmane and Google, but I haven't found
 much more information regarding your two points. What IS stored in the
 token field of the table bayes_token? And how is the SHA1 hash involved?
 Where can I find documentation of this? Any suggestions would be greatly
 appreciated.

As a side note, have you seen the reverse engineer sha1 and md5 search engine 
yet?

http://md5.rednoize.com/


-- 
Best regards,
 Fred                          mailto:[EMAIL PROTECTED]



Re: getting Bayes token data from spamassassin

2007-01-17 Thread Jonas Eckerman
Justin Mason wrote:
 http://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Plugin.html#item_bayes_learn

Thanks!

 by the way, a nice, working plugin that does this would be quite useful

Since it was so straight-forward I made a small plugin that collects the raw 
tokens in a SQL table.

I've only been using it for about an hour, so there may well be problems 
with it. It ought to work though :-)
I've only tested it with MySQL, but I think it should work unmodified with 
SQLite as well, and it should be trivial to modify for other SQL servers.

If anyone wants to test it, it's called CollectTokens.pm and is available at 
http://whatever.frukt.org/spamassassin.text.shtml. Please tell me if you 
find any problems.

What to actually do with the collected data is up to you, but here's two 
example queries:

Top 10 ham tokens:
SELECT bayes_token.ham_count,bayes_rawtoken.rawtoken 
  FROM bayes_rawtoken,bayes_token 
  WHERE bayes_rawtoken.token=bayes_token.token
  ORDER BY bayes_token.ham_count DESC LIMIT 10;

Top 10 spam tokens:
SELECT bayes_token.spam_count,bayes_rawtoken.rawtoken 
  FROM bayes_rawtoken,bayes_token 
  WHERE bayes_rawtoken.token=bayes_token.token
  ORDER BY bayes_token.spam_count DESC LIMIT 10;

Not sure that this is useful for anything at all, but curiosity is part of 
human nature. :-)
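
The two queries above use an implicit join. As a rough sketch, the same lookup can be written with an explicit JOIN and tried from Python against a throwaway SQLite database. The table and column names are the ones used in the queries; the toy rows, the `demo_db` helper and the `top_tokens` function are made up for illustration and are not part of CollectTokens.pm.

```python
import sqlite3


def demo_db():
    """Build an in-memory database with made-up rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bayes_token (token TEXT, spam_count INTEGER, ham_count INTEGER)"
    )
    conn.execute("CREATE TABLE bayes_rawtoken (token TEXT, rawtoken TEXT)")
    conn.executemany(
        "INSERT INTO bayes_token VALUES (?, ?, ?)",
        [("t1", 9, 0), ("t2", 4, 1), ("t3", 0, 7)],
    )
    conn.executemany(
        "INSERT INTO bayes_rawtoken VALUES (?, ?)",
        [("t1", "pills"), ("t2", "offer"), ("t3", "meeting")],
    )
    return conn


def top_tokens(conn, kind="spam", limit=10):
    """Return (count, rawtoken) rows for the most frequent spam or ham tokens."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    column = kind + "_count"  # safe to interpolate: kind was validated above
    sql = (
        "SELECT t." + column + ", r.rawtoken "
        "FROM bayes_rawtoken AS r "
        "JOIN bayes_token AS t ON r.token = t.token "
        "ORDER BY t." + column + " DESC LIMIT ?"
    )
    return conn.execute(sql, (limit,)).fetchall()


db = demo_db()
print(top_tokens(db, "spam", 2))
print(top_tokens(db, "ham", 2))
```

Passing the limit as a bound parameter keeps the statement text constant, so only the column name varies between the ham and spam versions.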

Regards
/Jonas

-- 
Jonas Eckerman, FSDB  Fruktträdet
http://whatever.frukt.org/
http://www.fsdb.org/
http://www.frukt.org/



Re: INFO_TLD

2007-01-17 Thread Eric A. Hall

On 1/16/2007 1:52 AM, Eric A. Hall wrote:
 On 1/16/2007 12:06 AM, Theo Van Dinter wrote:
 On Mon, Jan 15, 2007 at 10:44:33PM -0500, Eric A. Hall wrote:
 sa-update nuked INFO_TLD which I was still finding useful
 can somebody with the rule send it to me? thanks

One of the aggressive porno spammers is all about .info, so in case
anybody else is looking for this rule:

uri  INFO_TLD  /\.info(?::\d+)?(?:\/|$)/i
describe INFO_TLD  Contains an URL in the INFO top-level domain
score    INFO_TLD  1.0

btw, I run with a lot higher than 1.0 here
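
To get a feel for what the pattern in that rule does and does not hit, here is a small Python sketch. It assumes Python's `re` module treats this particular pattern the same way Perl does, which should hold for these constructs; the `hits_info_tld` helper and the sample URIs are made up for illustration.

```python
import re

# Same pattern as the INFO_TLD rule: ".info", an optional ":port",
# then either a slash or the end of the URI.
INFO_TLD = re.compile(r"\.info(?::\d+)?(?:/|$)", re.IGNORECASE)


def hits_info_tld(uri):
    """Return True if the URI would match the INFO_TLD pattern."""
    return INFO_TLD.search(uri) is not None


for uri in (
    "http://www.thesillyguy.info",       # bare host in .info
    "http://example.info:8080/page",     # host with a port
    "http://www.info.example.com/",      # "info" as a subdomain label
    "http://example.com/readme.info",    # path that happens to end in .info
):
    print(uri, hits_info_tld(uri))
```

Note that the pattern is not anchored to the host part, so a path ending in ".info" on any domain also matches.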

-- 
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols  http://www.oreilly.com/catalog/coreprot/


Re: getting Bayes token data from spamassassin

2007-01-17 Thread Jonas Eckerman
Jonas Eckerman wrote:
 Justin Mason wrote:
 by the way, a nice, working plugin that does this would be quite useful

 Since it was so straight-forward I made a small plugin that collects the raw 
 tokens in a SQL table.

An extra note:

I do not consider my plugin nice since it uses DBI in such an unoptimized 
way. I'm not very good at database programming, and this was a quick hack.

It really should use a prepared statement, since it performs the same 
operation a number of times for every learnt message. It should probably use 
the DELAYED keyword when used with MySQL and MyISAM tables. It could also be 
made faster by using INSERT with a fallback to UPDATE (for the atime) rather 
than REPLACE INTO.

I might do those fixes. Or maybe you'll do them.
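
As a rough sketch of the INSERT-with-fallback-to-UPDATE idea, here is how it could look in Python with SQLite. This is not the plugin's code (the plugin is Perl using DBI); the three-column layout of `bayes_rawtoken`, the primary key on `token`, and the `open_db` and `store_tokens` helpers are assumptions made for illustration.

```python
import sqlite3
import time


def open_db(path=":memory:"):
    """Open the database and create a hypothetical bayes_rawtoken table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bayes_rawtoken ("
        " token TEXT PRIMARY KEY,"
        " rawtoken TEXT NOT NULL,"
        " atime INTEGER NOT NULL)"
    )
    return conn


def store_tokens(conn, tokens, atime=None):
    """Store (hashed_token, raw_token) pairs from one learnt message."""
    atime = int(time.time()) if atime is None else atime
    # Constant SQL text with placeholders: the driver can reuse the
    # compiled statement instead of parsing new SQL for every token.
    insert_sql = "INSERT INTO bayes_rawtoken (token, rawtoken, atime) VALUES (?, ?, ?)"
    update_sql = "UPDATE bayes_rawtoken SET atime = ? WHERE token = ?"
    with conn:  # one transaction per message, committed on exit
        for token, raw in tokens:
            try:
                conn.execute(insert_sql, (token, raw, atime))
            except sqlite3.IntegrityError:
                # Token already known: only refresh its access time.
                conn.execute(update_sql, (atime, token))


db = open_db()
store_tokens(db, [("h1", "pills"), ("h2", "hello")], atime=100)
store_tokens(db, [("h1", "pills")], atime=200)
print(db.execute("SELECT token, atime FROM bayes_rawtoken ORDER BY token").fetchall())
```

Compared with REPLACE INTO, an existing row is updated in place rather than deleted and re-inserted.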

Regards
/Jonas

-- 
Jonas Eckerman, FSDB  Fruktträdet
http://whatever.frukt.org/
http://www.fsdb.org/
http://www.frukt.org/



expected speed of sa-learn against local flatfile

2007-01-17 Thread Burton Windle
What is the expected speed of sa-learn when the Bayes db is a local 
flatfile (non-NFS, non-SQL)? The machine has a Celeron 2.26GHz, 512 MB RAM, and 
a modern SATA drive formatted as ext3, and is otherwise idle; learning 
runs at 2-3 messages/second (SpamAssassin 3.1.7-1 from Debian Testing) when 
learning batches of 50 - 200 emails.


[EMAIL PROTECTED]:~$ sa-learn --dump magic
0.000  0  3  0  non-token data: bayes db version
0.000  0  23314  0  non-token data: nspam
0.000  0  10547  0  non-token data: nham
0.000  0 267116  0  non-token data: ntokens
0.000  0 1167831004  0  non-token data: oldest atime
0.000  0 1169061004  0  non-token data: newest atime
0.000  0 1169061270  0  non-token data: last journal sync atime
0.000  0 1169040677  0  non-token data: last expiry atime
0.000  0 172800  0  non-token data: last expire atime delta
0.000  0  14141  0  non-token data: last expire reduction count

[EMAIL PROTECTED]:~/.spamassassin$ ls -l
total 14756
-rw--- 1 bwindle bwindle  5251072 2007-01-17 14:10 auto-whitelist
-rw--- 1 bwindle bwindle  4997120 2007-01-17 14:14 bayes_seen
-rw--- 1 bwindle bwindle 10375168 2007-01-17 14:14 bayes_toks
-rw-r--r-- 1 bwindle bwindle 5367 2007-01-03 09:28 user_prefs


[EMAIL PROTECTED]:~$ time sa-learn --spam --mbox --progress ~/mail/SPAM-certainly
100% [===] 2.51 msgs/sec 00m11s DONE

Learned tokens from 25 message(s) (29 message(s) examined)

real    0m14.374s
user    0m4.040s
sys     0m0.170s


--
Burton Windle   [EMAIL PROTECTED]



Re: URIBL

2007-01-17 Thread Chris Purves

Jon Bjorn Njalsson wrote:

Is it possible to have SA find a URL in a mail, look up the IP address
for the URL, check whether that IP address is listed in some RBL zone, and
score accordingly?

For example, I receive a lot of spam containing URLs like
http://www.thesillyguy.info or thenopers.info, and these sites all
resolve to the same IP address, 216.40.47.17. Instead of writing rules
based on these sites, is it possible to write a rule based on the
IP address?


Your e-mail hit the following rules for me:

*  3.0 URIBL_BLACK Contains an URL listed in the URIBL blacklist
*  [URIs: thenopers.info thesillyguy.info]
*  4.1 URIBL_JP_SURBL Contains an URL listed in the JP SURBL blocklist
*  [URIs: thenopers.info thesillyguy.info]

I have 'loadplugin Mail::SpamAssassin::Plugin::URIDNSBL' set in init.pre.
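
For the IP-based check Jon asks about, it may help to see how an IP blocklist lookup is normally formed: the octets of the address are reversed and the list's zone is appended, and the resulting name is then queried in DNS. The Python sketch below only builds that query name; the zone `rbl.example.org` is a placeholder, the `dnsbl_query_name` helper is made up for illustration, and this is not how the URIDNSBL plugin is implemented.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    # IPv4Address() validates the input and raises ValueError on garbage.
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone.strip(".")


# The address from the example above, against a placeholder zone.
print(dnsbl_query_name("216.40.47.17", "rbl.example.org"))
```

A listed address would typically resolve to a 127.0.0.x answer for that name, while an unlisted one returns no record.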

--
Chris



Re: This score makes no sense

2007-01-17 Thread Chris
On Wednesday 17 January 2007 9:28 am, Jason Faulkner wrote:
   X-Spam-Status: No, score=4.5 required=5.0 tests=BAYES_05,CLAMAV,
  HTML_IMAGE_ONLY_16,HTML_MESSAGE,MIME_HTML_ONLY,REPLY_TO_EMPTY
  autolearn=disabled version=3.1.7
 
   5.0 BAYES_99   BODY: Bayesian spam probability is 99 to 100%
  [score: 1.]

 Your header shows BAYES_05, which would be a negative scoring, while
 your second run shows BAYES_99.

 That could account for some of the problem.
You're right, Jason. I have BAYES_05 scored as:

score BAYES_05 0 0 -6.600 -6.600

It makes sense now. As usual I leapt before I looked.

Thanks for pointing that out to me.
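
For anyone puzzled by the four numbers on that score line: as far as I understand the SpamAssassin configuration docs, they are one value per score set, in the order (no Bayes, no network tests), (no Bayes, network tests), (Bayes, no network tests), (Bayes and network tests). A small Python sketch of that selection, with the `pick_score` helper made up for illustration:

```python
def pick_score(scores, bayes, net):
    """Pick the value that applies from a four-value score line.

    scores: the four numbers in the order they appear on the score line.
    bayes:  True if Bayes is in use.
    net:    True if network tests are in use.
    """
    index = (2 if bayes else 0) + (1 if net else 0)
    return scores[index]


bayes_05 = [0, 0, -6.600, -6.600]  # from "score BAYES_05 0 0 -6.600 -6.600"
print(pick_score(bayes_05, bayes=True, net=True))
print(pick_score(bayes_05, bayes=False, net=True))
```

With Bayes enabled, either of the last two values applies, so BAYES_05 takes 6.6 points off the total.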

Chris

-- 
Chris
KeyID 0xE372A7DA98E6705C
http://learn.to/quote




RE: This score makes no sense

2007-01-17 Thread Dan Barker
Put tests=_TESTSSCORES(,)_ in your local.cf and you won't scratch your head
so hard next time.

It puts in lines like:
X-Spam-Status: No, score=-100.0 required=5.0
tests=AWL=0.009,BAYES_50=0.001,SPF_PASS=-0.001,USER_IN_WHITELIST=-100
autolearn=no version=3.1.7

when coded as:
add_header all Status _YESNO_, score=_SCORE_ required=_REQD_
tests=_TESTSSCORES(,)_ autolearn=_AUTOLEARN_ version=_VERSION_

It would have been hard to get confused with the minus 6 staring you in the
face <g>.
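
Once the per-rule scores are in the header, they are also easy to pick apart with a script. A minimal Python sketch, assuming the `tests=` field has exactly the NAME=score,NAME=score form shown above; the `parse_tests` helper is made up for illustration.

```python
def parse_tests(field):
    """Turn 'NAME=score,NAME=score,...' into a {name: score} dict."""
    scores = {}
    for item in field.split(","):
        # Split on the last "=" so the rule name stays intact.
        name, _, value = item.strip().rpartition("=")
        scores[name] = float(value)
    return scores


field = "AWL=0.009,BAYES_50=0.001,SPF_PASS=-0.001,USER_IN_WHITELIST=-100"
scores = parse_tests(field)
print(scores)
print(round(sum(scores.values()), 1))
```

For the sample header the sum rounds to -100.0, which agrees with the score= value on the same line.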

Dan



-Original Message-
From: Chris [mailto:[EMAIL PROTECTED]
Sent: Wednesday, January 17, 2007 6:02 PM
To: Jason Faulkner
Cc: users@spamassassin.apache.org
Subject: Re: This score makes no sense


On Wednesday 17 January 2007 9:28 am, Jason Faulkner wrote:
   X-Spam-Status: No, score=4.5 required=5.0 tests=BAYES_05,CLAMAV,
  HTML_IMAGE_ONLY_16,HTML_MESSAGE,MIME_HTML_ONLY,REPLY_TO_EMPTY
  autolearn=disabled version=3.1.7
 
   5.0 BAYES_99   BODY: Bayesian spam probability is 99 to
100%
  [score: 1.]

 Your header shows BAYES_05, which would be a negative scoring, while
 your second run shows BAYES_99.

 That could account for some of the problem.
You're right Jason, I have bayes_05 as scoring

score BAYES_05 0 0 -6.600 -6.600

It makes sense now. As usual I leapt before I looked.

Thanks for pointing that out to me.

Chris

--
Chris
KeyID 0xE372A7DA98E6705C
http://learn.to/quote



Re: getting Bayes token data from spamassassin

2007-01-17 Thread Michael Parker
Jonas Eckerman wrote:
 Justin Mason wrote:
 http://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Plugin.html#item_bayes_learn
 
 Thanks!
 
 by the way, a nice, working plugin that does this would be quite useful
 
 Since it was so straight-forward I made a small plugin that collects the raw 
 tokens in a SQL table.
 

Very nice, that's pretty much what I envisioned when I created the plugin
hooks, and very similar to my original proof of concept.

If you wanted to reduce the insert/update time you could also do
something like this:
http://jroller.com/page/dschneller?entry=mysql_replication_using_blackhole_engine


Once you have it like you want it, I suggest posting it to the
CustomPlugins wiki page so others can easily find it.

Michael


How to clear all spam headers

2007-01-17 Thread Asif Iqbal

Hi All

I have an email that was falsely tagged as spam. Is there an easy way to
clean up all the headers that were added by my mail server and my
SpamAssassin? I am trying to get a clean copy of the email and pipe it
to sa-learn to learn it as ham.

Thanks

--
Asif Iqbal
PGP Key: 0xE62693C5 KeyServer: pgp.mit.edu


Re: How to clear all spam headers

2007-01-17 Thread Evan Platt

At 04:41 PM 1/17/2007, you wrote:

Hi All

I have a email that falsely tagged as spam. Is there a easy way to
clean up all the headers that are added by my mail server and my
spamassassin? I am trying to get a clean copy of the email and pipe to
to sa-learn to learn it as ham.



I could be wrong, but I'm pretty sure that it's fine to learn a spam 
message with the SA markup.


SpamAssassin is smart enough (I'm sure I'll be corrected if I'm wrong). 



Rule to check a specific language

2007-01-17 Thread Sander Holthaus
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
 
I'm interested in making a rule that adds a small negative score to
email in my native language, Dutch. Since I receive very little spam
in Dutch (apart from the occasional bulk business offer, which is not
considered spam in the Netherlands), this seems a good idea, since some
peculiarly formatted legitimate Dutch mails score close to the threshold.

But how do I do this for just one specific language? I use the TextCat
plugin for ok_languages, and I know SpamAssassin puts the information
in _LANGUAGES_, so I want a rule that checks whether
_LANGUAGES_ eq 'nl' (a bit silly to use /^nl$/). But I can't seem to
find anywhere how to put that in a rule,
since they all check against things like header, body, full, etc.

Kind Regards,
Sander Holthaus
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (MingW32)
 
iD8DBQFFrsaDVf373DysOTURAgNeAKDt2NOS4KEr0Nje8QkEcuZKwg/PVgCZAdLi
x1nRzOqyR1f8uFmhGgu4XwM=
=fYOR
-END PGP SIGNATURE-



Sa-update does nothing??

2007-01-17 Thread Steve Lake
Ok, had some of that new home grown spam getting through recently, 
especially that garbled Russian nonsense, so I ran sa-update to update my 
rules.  It sat there thinking for a second and then dropped to a command 
prompt again.  There's no sign anywhere that it did anything to update my 
rules.  Did I miss something, or possibly miss a step?



Steven Lake
Owner/Technical Writer
Raiden's Realm
www.raiden.net
A friendly web community




Re: How to clear all spam headers

2007-01-17 Thread jdow

From: Evan Platt [EMAIL PROTECTED]


At 04:41 PM 1/17/2007, you wrote:

Hi All

I have a email that falsely tagged as spam. Is there a easy way to
clean up all the headers that are added by my mail server and my
spamassassin? I am trying to get a clean copy of the email and pipe to
to sa-learn to learn it as ham.



I could be wrong, but I'm pretty sure that it's fine to learn a spam 
message with the SA markup.


SpamAssassin is smart enough (I'm sure I'll be corrected if I'm wrong).


Ayup - been doing this for years now.
{^_^}


Re: Sa-update does nothing??

2007-01-17 Thread Theo Van Dinter
On Wed, Jan 17, 2007 at 08:37:43PM -0500, Steve Lake wrote:
 prompt again.  There's no sign anywhere that it did anything to update my 
 rules.  Did I miss something, or possibly miss a step?

Nope.  By default it provides no output other than an exit code (so you
can do things like "sa-update && service spamd restart").  If you're
curious about what is going on, run with -D.
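
The same idea, restarting spamd only when sa-update exits with status 0, can be scripted. A minimal Python sketch, assuming `sa-update` and `service` are on the PATH; the `update_and_restart` helper and its injectable `run` argument are made up for illustration.

```python
import subprocess


def update_and_restart(run=subprocess.call):
    """Restart spamd only if sa-update exits with status 0.

    `run` takes a command list and returns its exit status; it is
    injectable so the logic can be tried without touching a real system.
    """
    status = run(["sa-update"])
    if status != 0:
        # Non-zero: nothing new was installed, or something failed,
        # so spamd is left alone.
        return False
    return run(["service", "spamd", "restart"]) == 0
```

Calling `update_and_restart()` from a nightly job gives the same behaviour as chaining the two commands in the shell.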

-- 
Randomly Selected Tagline:
See, there's still a lot of heat in there. That rice is still cooking. You
 open that lid now, whew, that rice will miss it's one shot at all it
 can be and believe me, a grain is a terrible thing to waste.
   - Alton Brown, Good Eats, Power To The Pilaf
 [on cooking pilaf]




Re: Rule to check a specific language

2007-01-17 Thread Theo Van Dinter
On Thu, Jan 18, 2007 at 01:59:47AM +0100, Sander Holthaus wrote:
 But how do I do this for just one specific language? I use the TextCat

ok_languages nl

  in _LANGUAGES_, so I want a rule that checks  if
 _LANGUAGES_ eq 'nl' (bit silly to use /^nl$/). But I can't seem to
 find anywhere how to put that in a rule,
 since they all check aganiast things like header, body, full, etc.

perldoc Mail::SpamAssassin::Plugin::TextCat

:)

-- 
Randomly Selected Tagline:
I cannot have an aide who will not look up. You will be forever walking
 into things. - Dukhat on Babylon 5




Re: This score makes no sense

2007-01-17 Thread Chris
On Wednesday 17 January 2007 5:16 pm, Dan Barker wrote:
 Put tests=_TESTSSCORES(,)_ in your local.cf and you won't scratch your head
 so hard next time.

 It puts in lines like:
 X-Spam-Status: No, score=-100.0 required=5.0
 tests=AWL=0.009,BAYES_50=0.001,SPF_PASS=-0.001,USER_IN_WHITELIST=-100
 autolearn=no version=3.1.7

 when coded as:
 add_header all Status _YESNO_, score=_SCORE_ required=_REQD_
 tests=_TESTSSCORES(,)_ autolearn=_AUTOLEARN_ version=_VERSION_

 It would have been hard to get confused with the minus 6 staring you in the
 face <g>.

 Dan

I guess I'm a bit dense Dan, but putting this:

tests=_TESTSSCORES(,)_

in my local.cf nets this

[EMAIL PROTECTED] ~]$ spamassassin --lint
[28196] warn: config: failed to parse line, skipping: tests=_TESTSSCORES(,)_

So, how confused am I?

-- 
Chris
KeyID 0xE372A7DA98E6705C
http://learn.to/quote




FuzzyOcr::O_NONBLOCK redefined

2007-01-17 Thread Quinn Comendant
Is anybody bothered with SA-related software discussions on this list? I've got 
a FuzzyOCR bug to report.

When I restart spamassassin, I get:

Subroutine FuzzyOcr::O_NONBLOCK redefined at /usr/lib/perl5/5.8.5/Exporter.pm 
line 65.
 at /usr/lib/perl5/5.8.5/i386-linux-thread-multi/POSIX.pm line 19

Quinn

-
Strangecode :: Internet Consultancy
http://www.strangecode.com/
+1 530 624 4410


RE: FuzzyOcr::O_NONBLOCK redefined

2007-01-17 Thread Gary V
Is anybody bothered with SA-related software discussions on this list? I've 
got a FuzzyOCR bug to report.


When I restart spamassassin, I get:

Subroutine FuzzyOcr::O_NONBLOCK redefined at 
/usr/lib/perl5/5.8.5/Exporter.pm line 65.

 at /usr/lib/perl5/5.8.5/i386-linux-thread-multi/POSIX.pm line 19

Quinn



This has been reported a few times. Google for "O_NONBLOCK redefined":
http://marc.theaimsgroup.com/?l=spamassassin-users&m=116829902909608
http://fuzzyocr.own-hero.net/ticket/16

Gary V





Re: Rule to check a specific language

2007-01-17 Thread Sander Holthaus
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1
 
Theo Van Dinter wrote:
 On Thu, Jan 18, 2007 at 01:59:47AM +0100, Sander Holthaus wrote:
 But how do I do this for just one specific language? I use the
 TextCat

 ok_languages nl

 in _LANGUAGES_, so I want a rule that checks  if _LANGUAGES_ eq
 'nl' (bit silly to use /^nl$/). But I can't seem to find anywhere
 how to put that in a rule, since they all check aganiast things
 like header, body, full, etc.

 perldoc Mail::SpamAssassin::Plugin::TextCat

 :)
Perhaps I wasn't entirely clear or I missed something, sorry. I
already have ok_languages in local, but for a much larger set of
languages. I want to make nl a special case, so that I have nl, other
ok_languages and the rest (unwanted ones). I already checked the docs
and had a glance at the TextCat source, but I don't see how I could do
this without making a rule that would check _LANGUAGES_ against either
'nl' or /^nl/ .

I've looked at these:

http://wiki.apache.org/spamassassin/WritingRules
http://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Plugin_TextCat.html
http://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Message_Metadata.html


but I can't put them together...

Kind Regards,
Sander Holthaus
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.6 (MingW32)
 
iD8DBQFFruBfVf373DysOTURApYbAKDXcmD4R5BzHfPsouuMDLUAuzPEsgCg5xW1
uU4U+2S/O3HjBUZWdKvCnLY=
=G99l
-END PGP SIGNATURE-



Re: This score makes no sense

2007-01-17 Thread Matt Kettler
Chris wrote:
 On Wednesday 17 January 2007 5:16 pm, Dan Barker wrote:
   
 Put tests=_TESTSSCORES(,)_ in your local.cf and you won't scratch your head
 so hard next time.

 It puts in lines like:
 X-Spam-Status: No, score=-100.0 required=5.0
 tests=AWL=0.009,BAYES_50=0.001,SPF_PASS=-0.001,USER_IN_WHITELIST=-100
 autolearn=no version=3.1.7

 when coded as:
 add_header all Status _YESNO_, score=_SCORE_ required=_REQD_
 tests=_TESTSSCORES(,)_ autolearn=_AUTOLEARN_ version=_VERSION_

 It would have been hard to get confused with the minus 6 staring you in the
 face <g>.

 Dan

 
 I guess I'm a bit dense Dan, but putting this:

 tests=_TESTSSCORES(,)_

 in my local.cf nets this

 [EMAIL PROTECTED] ~]$ spamassassin --lint
 [28196] warn: config: failed to parse line, skipping: tests=_TESTSSCORES(,)_

 So, how confused am I?
   
That's all one line, eliminate the line-wrap between _REQD_ and tests.




Re: How to clear all spam headers

2007-01-17 Thread Matt Kettler
Asif Iqbal wrote:
 Hi All

 I have a email that falsely tagged as spam. Is there a easy way to
 clean up all the headers that are added by my mail server and my
 spamassassin? I am trying to get a clean copy of the email and pipe to
 to sa-learn to learn it as ham.

 Thanks

1) You don't have to do this; sa-learn automatically removes anything
that SA itself added.

2) If you pipe it through spamassassin --remove-markup, it should clean
it up.
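
If you do want the explicit two-step version (strip the markup, then learn as ham), it can be scripted. A minimal Python sketch using only the two commands mentioned in this thread; the `learn_as_ham` helper and its injectable `run` argument are made up for illustration, and per point 1 the first step should not be needed in practice.

```python
import subprocess


def learn_as_ham(raw_message, run=subprocess.run):
    """Strip SpamAssassin markup from a message, then feed it to sa-learn.

    raw_message: the full message as bytes, headers included.
    run: injectable stand-in for subprocess.run, handy for dry runs.
    """
    # Step 1: let SpamAssassin remove what it added to the message.
    cleaned = run(
        ["spamassassin", "--remove-markup"],
        input=raw_message,
        stdout=subprocess.PIPE,
        check=True,
    ).stdout
    # Step 2: pipe the cleaned copy to sa-learn on stdin.
    run(["sa-learn", "--ham"], input=cleaned, check=True)
```

With `check=True`, a failure in either command raises an exception instead of silently learning a half-processed message.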


Re: expected speed of sa-learn against local flatfile

2007-01-17 Thread Matt Kettler
Burton Windle wrote:
 What is the expected speed of a sa-learn when the bayes db is a local
 flatfile (non-nfs, non-SQL)? Machine has a Celeron 2.26GHz, 512mb RAM,
 and a modern SATA drive formatted as ext3, and is otherwise idle;
 learning does 2-3 messages/second (Spamassassin from Debian Testing's
 3.1.7-1) when learning batches of 50 - 200 emails.

That sounds about right, but it depends a lot on the speed of that
SATA drive, the size of the messages, etc.

The last time a published Bayes benchmark was run, on a 2.8 GHz P4, it did 200
messages in 124 seconds. That's 16 messages per second, but it's a larger
batch and a slightly faster machine. There's no mention of the disk
config, so it might be a RAID array, which would make a considerable
performance difference compared to a single SATA drive.

See: http://wiki.apache.org/spamassassin/BayesBenchmark






Re: This score makes no sense

2007-01-17 Thread Chris
On Wednesday 17 January 2007 8:55 pm, Matt Kettler wrote:

  [EMAIL PROTECTED] ~]$ spamassassin --lint
  [28196] warn: config: failed to parse line, skipping:
  tests=_TESTSSCORES(,)_
 
  So, how confused am I?

 That's all one line, eliminate the line-wrap between _REQD_ and tests.

Thanks Matt, that fixed it.

-- 
Chris
KeyID 0xE372A7DA98E6705C
http://learn.to/quote




Re: FuzzyOCR 3.5.1 not using FUZZY_OCR rule when using hash

2007-01-17 Thread Quinn Comendant
On Mon, 15 Jan 2007 18:17:14 -0800, Quinn Comendant wrote:
 When I set focr_enable_image_hashing 2:
[...]

HINT: I notice from http://fuzzyocr.own-hero.net/wiki/WhatisFuzzyOcr that this 
email should be tagged with FUZZY_OCR_KNOWN_HASH but note in my previous email 
this wasn't included in my spamc -R report.

Also, I've added this issue to ticket #62: 
http://fuzzyocr.own-hero.net/ticket/62

Q


-
Strangecode :: Internet Consultancy
http://www.strangecode.com/
+1 530 624 4410