ZMI

2006-09-13 Thread hamann . w
What is the current home of the ZMI (German) ruleset?

Wolfgang Hamann





Re: Which DB is actually used?

2006-09-13 Thread Bo Mellberg



jdow wrote:

From: Logan Shaw [EMAIL PROTECTED]


On Fri, 8 Sep 2006, Bo Mellberg wrote:
It seems like the exim-users database is being touched regularly, so 
I'm guessing that it has been set up by apt-get in some 
auto-learning state.


Yes, you might want to check whatever's running SpamAssassin and
see what user it's running as and also check the configuration
files (probably in /etc/mail/spamassassin) to see where it's
storing the database.

I have earlier trained spam and ham as user bosse, which is why 
there is a working db there as well.


As I am the only user on my system, it really doesn't matter if I use 
site-wide or not, but rather how I invoke sa-learn.


Let's say I remove the databases for bosse and root. Is this the 
proper way to invoke sa-learn:


1. Log on as user bosse
2. sa-learn --showdots --sync --dbpath /var/spool/exim4/.spamassassin 
--spam /home/bosse/Maildir/.MissedSpam/cur


Probably not, or at least not the best way.


Absolutely not. The database under bosse is quite apparently not
being used except for his misplaced training. He needs to su -l exim4
and then run sa-learn.


I thought that this was what --dbpath was meant for. To tell sa-learn 
what database to actually update. In the case above, the exim DB is 
trained with spam from the bosse-user. So IF the exim DB is the one 
used for spam control, it would with the above command be the one 
trained, no?


A better solution is of course to tell SA to use per-user databases and 
log on as bosse and train normally. I'll do some RTFM and googling to 
see how the setup for Debian is actually made.


/Bo


Ask about more detail rule description

2006-09-13 Thread L's

Is there any document that describes the rules in simple words?

The description generated by X-Spam-Report: is too technical.

Is there any description that is easier to read than:
[http://spamassassin.apache.org/tests_3_1_x.html]

-- 
View this message in context: 
http://www.nabble.com/Ask-about-more-detail-rule-description-tf2263654.html#a6281047
Sent from the SpamAssassin - Users forum at Nabble.com.



Re: ZMI

2006-09-13 Thread Jeremy Fairbrass
AFAIK it's currently residing at http://zmi.at/x/70_zmi_german.cf

- Jeremy



[EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 what is the current home of the ZMI (german) ruleset?

 Wolfgang Hamann 





RE: Anyone get the Sa coach outlook plugin to work?

2006-09-13 Thread Michael Scheidell

 -Original Message-
 From: Tim Litwiller [mailto:[EMAIL PROTECTED] On Behalf Of 
 Tim Litwiller
 Sent: Tuesday, September 12, 2006 11:38 PM
 To: SpamAssassin Users List
 Subject: Re: Anyone get the Sa coach outlook plugin to work?
 
 
 on a related note: How do you make spamd listen on port 783 - when I 
 telnet to that port it times out - I get no answer.

man spamd

By default it only listens on localhost (change that with the -i option)
and only accepts connections from 127.0.0.1 (change that with the -A option).
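
A quick way to tell "spamd is not listening on this address" from "spamd is not running" is to send it a PING, the same exchange spamc uses. The sketch below is illustrative only: the request and reply strings reflect the spamd protocol as I understand it ("PING SPAMC/1.2" answered by a "SPAMD/... PONG" line), and the function names are made up for this example.

```python
import socket


def build_ping() -> bytes:
    # Assumed request line of the spamd protocol; the version number is a guess
    # that older and newer spamd releases are expected to accept.
    return b"PING SPAMC/1.2\r\n"


def is_pong(response: bytes) -> bool:
    # A healthy reply is expected to look like "SPAMD/1.5 0 PONG".
    parts = response.split(b"\r\n", 1)[0].split()
    return len(parts) >= 3 and parts[0].startswith(b"SPAMD/") and parts[-1] == b"PONG"


def spamd_reachable(host: str = "127.0.0.1", port: int = 783, timeout: float = 3.0) -> bool:
    # Returns False on refusal or timeout, which is what a remote client sees
    # when spamd is bound to localhost only.
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(build_ping())
            return is_pong(conn.recv(1024))
    except OSError:
        return False


if __name__ == "__main__":
    print("spamd answers on localhost:", spamd_reachable())
```

If this succeeds on the server itself but times out from another machine, the listen address (-i) and allowed clients (-A) are the first things to look at.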


I have a 'sa-learnd' that implements a subset of spamd (ping/pong/TELL
SPAM) and allows you to set a -r (report) flag to do either sa-learn
--spam style learning, or add in spamassassin -r (report) as well.

Released under GPL2 if anyone wants to try it.

No documentation, no installer yet, only tarball and sample rc file.

Requires *nix and perl, haven't tried it on windows, and not likely to.



spamassassin --lint just hangs

2006-09-13 Thread Ramprasad
I find that 
spamassassin -D --lint sometimes just hangs.

the output goes 
.
..
[28316] dbg: bayes: tie-ing to DB file
R/W /var/spool/MailScanner/spamassassin/bayes_toks
[28316] dbg: bayes: tie-ing to DB file
R/W /var/spool/MailScanner/spamassassin/bayes_seen
[28316] dbg: bayes: found bayes db version 3
[28316] dbg: locker: refresh_lock:
refresh /var/spool/MailScanner/spamassassin/bayes.mutex

(That's it; here it waits forever.)

I have a busy system and a bayes_toks file of 32MB.
I tried to strace the pid of the process and could see a lot of
pread/pwrite calls.

Any idea what's going on?

Thanks
Ram




RE: spamassassin --lint just hangs

2006-09-13 Thread Sietse van Zanen



Might be a corrupted database. Try moving it aside and starting with a clean one. If the lint then succeeds, the problem is your bayes db.
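
One cautious way to "move it" is to set the Bayes files aside in a dated backup directory rather than deleting them, so they can be restored if they turn out to be fine. This is only a sketch: it assumes the files are named bayes_toks, bayes_seen (and possibly bayes_journal) as in the debug output, the backup directory name is invented here, and whatever is scanning mail should be stopped first.

```python
import shutil
import time
from pathlib import Path


def set_aside_bayes(db_dir, prefix="bayes"):
    """Move <prefix>_* files from db_dir into a new dated backup directory."""
    src = Path(db_dir)
    # Hypothetical naming scheme; the hyphen keeps it out of the "<prefix>_*" glob.
    backup = src / f"{prefix}-backup-{time.strftime('%Y%m%d-%H%M%S')}"
    backup.mkdir()
    moved = []
    for path in sorted(src.glob(prefix + "_*")):
        if path.is_file():
            shutil.move(str(path), str(backup / path.name))
            moved.append(path.name)
    return backup, moved


# Example (path taken from the debug output in the original post):
# backup, moved = set_aside_bayes("/var/spool/MailScanner/spamassassin")
```

After this, rerunning spamassassin -D --lint should create fresh database files; if it no longer hangs, the old files are the likely culprit.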

-Sietse


From: Ramprasad
Sent: Wed 13-Sep-06 13:25
To: spamassassin-users
Subject: spamassassin --lint just hangs
I find that 
spamassassin -D --lint sometimes just hangs.

the output goes 
.
..
[28316] dbg: bayes: tie-ing to DB file
R/W /var/spool/MailScanner/spamassassin/bayes_toks
[28316] dbg: bayes: tie-ing to DB file
R/W /var/spool/MailScanner/spamassassin/bayes_seen
[28316] dbg: bayes: found bayes db version 3
[28316] dbg: locker: refresh_lock:
refresh /var/spool/MailScanner/spamassassin/bayes.mutex

(That's it; here it waits forever.)

I have a busy system and a bayes_toks file of 32MB.
I tried to strace the pid of the process and could see a lot of
pread/pwrite calls.

Any idea what's going on?

Thanks
Ram





Broken Dependency

2006-09-13 Thread Jonathan Allen
Hi All,

I'm running 3.1.5 in FC2 and have a broken dependency.  Any ideas how to
fix it ?

   meta test DIGEST_MULTIPLE has undefined dependency 'DCC_CHECK'

Jonathan


Re: Broken Dependency

2006-09-13 Thread Jonathan Allen
Theo,

 There's nothing broken.  The message is informational, not an error. (it means
 you don't have the DCC plugin loaded which would define that rule, which is
 fine if you're not using DCC...)

Thank you for explaining that.  I have been trying to work out why some
spams made it through SA and found that in the log.  Running SA -t on
the (saved) spam told me: Broken Pipe and running it with -D as well
showed that it broke during base64 decoding - I assume the attachment,
a GIF, was badly encoded.  Otherwise it looked just the usual text +
something in a GIF spam.

Jonathan


Re: Broken Dependency

2006-09-13 Thread Theo Van Dinter
On Wed, Sep 13, 2006 at 12:44:47PM +0100, Jonathan Allen wrote:
 I'm running 3.1.5 in FC2 and have a broken dependency.  Any ideas how to
 fix it ?
meta test DIGEST_MULTIPLE has undefined dependency 'DCC_CHECK'

There's nothing broken.  The message is informational, not an error. (it means
you don't have the DCC plugin loaded which would define that rule, which is
fine if you're not using DCC...)

-- 
Randomly Generated Tagline:
Warning: Any government that tries to protect citizens from every
 conceivable risk must necessarily resort to tyranny.
 - Randy Cassingham, This is True Mailing List




Testing whitelist

2006-09-13 Thread Beginner
Hi,

=== SyS Stuff 
SpamAssassin version 3.0.3
  running on Perl version 5.8.4

Exim 4.2, on Debian 3.1, sitewide config.

/usr/sbin/spamd --nouser-config --max-children 6
  --helper-home-dir=/var/spool/spamassassin/ --username=nobody -d
  --pidfile=/usr/local/run/spamd.pid



I have recently had an increase in false positives, mostly as a 
result of rbl checks. I have been relying on whitelist to ensure mail 
from some addresses gets through but I can not confirm that it is 
working. It actually looks like it isn't using the whitelist at all.

If I run `spamassassin -D  test.eml` there is no reference in the 
output to the whitelist file.

If I run spamassassin -D --lint it does refer to the whitelist file:

debug: lock: 18407 created /usr/local/spl-mail/conf/auto-whitelist.lock.myserver.mydomain.com.18407
debug: lock: 18407 trying to get lock on /usr/local/my-mail/conf/auto-whitelist with 0 retries
debug: lock: 18407 link to /usr/local/my-mail/conf/auto-whitelist.lock: link ok
debug: Tie-ing to DB file R/W in /usr/local/my-mail/conf/auto-whitelist

However addresses that are added to the whitelist do not seem to be 
getting through. I have the following string in the whitelist

[EMAIL PROTECTED]|ip=none|totscore
-100

I think that means that they are whitelisted but mail from them still 
scores more than 7 and they are bounced.

Is there some way to read out the whitelist and blacklist? I could at 
least confirm that someone is in the right list, or are they one and 
the same?

In the extract below the above user scores 8.0. If they started at
-100 that shouldn't be possible (temporary reject is +7).

  X-Spam-Level: 
  X-Spam-Status: No, score=8.0 required=9.0 
tests=BAYES_99,HTML_90_100,HTML_MESSAGE,MIME_HTML_MOSTLY,
NO_REAL_NAME autolearn=no version=3.0.3


Can anyone offer any advice please? I am at a loss.
TIA.
Dp.

=== local.cf ===

#
report_safe 0
rewrite_header subject SPAM: _HITS_:
trusted_networks 194.200.237.128/25 127.0.0.1/32
bayes_path /etc/spamassassin/bayes/bayes
#use_bayes 0
#auto_whitelist_path /etc/spamassassin/auto-whitelist
use_auto_whitelist 1
auto_whitelist_path /usr/local/spl-mail/conf/auto-whitelist
required_score 9.0
score BAYES_99 4.0
score FORGED_RCVD_HELO 2.9
score RCVD_HELO_IP_MISMATCH 4.9
score URIBL_SC_SURBL 6.5
score URIBL_WS_SURBL 6.5
score URIBL_SBL 6.5
score URIBL_OB_SURBL 6.5
score RCVD_IN_NJABL_DUL 6.5
score RCVD_IN_SORBS_DUL 6.5
score RCVD_IN_BL_SPAMCOP_NET 6.5
score DRUGS_PAIN 1.0
score DRUGS_ERECTILE 2.0
.

snip

==


Re: Testing whitelist

2006-09-13 Thread Matt Kettler
Beginner wrote:
 Hi,

 === SyS Stuff 
 SpamAssassin version 3.0.3
   running on Perl version 5.8.4

 Exim 4.2, on Debian 3.1, sitewide config.
   
I hope that 3.0.3 version is the one that Debian patched to fix the two
security holes that exist in the original 3.0.3. (AFAIK Debian did
backport the fixes, and made a 3.0.3-x release)

See: http://wiki.apache.org/spamassassin/Security

 /usr/sbin/spamd --nouser-config --max-children 6 --helper-home-
 dir=/var/spool/spamassassin/ --username=nobody -d --
 pidfile=/usr/local/run/spamd.pid

 

 I have recently have a increase in false positives, mostly as a 
 result of rbl checks. I have been relying on whitelist to ensure mail 
 from some addresses gets through but I can not confirm that it is 
 working. It actually looks like it isn't using the whitelist at all.

 If I run `spamassassin -D  test.eml` there is no reference in the 
 output to the whitelist file.

 If I run spamassasin -D --lint is does refer to the whitelist file:

 debug: lock: 18407 created /usr/local/spl-mail/conf/auto-
 whitelist.lock.myserver.mydomain.com.18407
 debug: lock: 18407 trying to get lock on /usr/local/my-mail/conf/auto-
 whitelist with 0 retries
 debug: lock: 18407 link to /usr/local/my-mail/conf/auto-
 whitelist.lock: link ok
 debug: Tie-ing to DB file R/W in /usr/local/my-mail/conf/auto-
 whitelist

 However addresses that are added to the whitelist do not seem to be 
 getting through. I have the following string in the whitelist

 [EMAIL PROTECTED]|ip=none|totscore
 -100

 I think that means that they are whitelisted but mail from them still 
 scores more than 7 and they are bounced.

 Is there some way to read out the whitelist and blacklist? 

Well, first, realize this is the AWL, which is called the auto
whitelist but it's NOT really a whitelist. It's a score-averager that
results in automatic white and blacklist behaviors.

I would not depend on it to auto-fix problems with particular senders.
It's intended to fix problems where a sender you frequently communicate
with occasionally sends a message that's slightly spam-like in
appearance. It cannot fix problems of a constant nature, as these will
just fold into the averages.

See
http://wiki.apache.org/spamassassin/AutoWhitelist

and:
http://wiki.apache.org/spamassassin/AwlWrongWay
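
To make the "score-averager" point concrete, here is a rough sketch of how the adjustment is usually described: the message score is pulled part of the way toward the sender's historical mean. The formula and the 0.5 factor are my understanding of the AWL (the factor corresponding to auto_whitelist_factor), not something confirmed in this thread, and the counts below are invented for illustration.

```python
def awl_adjusted_score(current_score, total_score, count, factor=0.5):
    """Pull current_score toward the sender's mean (total_score / count).

    Assumption: adjusted = score + (mean - score) * factor. With no history
    (count == 0) the score is left unchanged.
    """
    if count == 0:
        return current_score
    mean = total_score / count
    return current_score + (mean - current_score) * factor


if __name__ == "__main__":
    # A sender with a long, clean history: the occasional spammy mail is softened.
    print(awl_adjusted_score(8.0, total_score=-100.0, count=50))
    # A sender whose mail always scores high: averaging changes nothing.
    print(awl_adjusted_score(8.0, total_score=80.0, count=10))
```

The second case is the "problems of a constant nature" situation: if every message from a sender scores around 8, the mean is also around 8 and the AWL has nothing to correct.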


As for extracting the AWL, you'll need the check-whitelist script. This
comes in the source tarball in the tools directory, but most
distro-packages do not install it.
You can get it by downloading the 3.0.3 tarball from:
http://archive.apache.org/dist/spamassassin/


Re: Testing whitelist

2006-09-13 Thread Dermot Paikkos
On 13 Sep 2006 at 10:21, Matt Kettler wrote:
 Beginner wrote:

 I hope that 3.0.3 version is the one that Debian patched to fix the
 two security holes that exist in the original 3.0.3. (AFAIK Debian did
 backport the fixes, and made a 3.0.3-x release)
 
 See: http://wiki.apache.org/spamassassin/Security

I wasn't aware of this but am looking into it now. 

 
 Well, first, realize this is the AWL, which is called the auto
 whitelist but it's NOT really a whitelist. It's a score-averager that
 results in automatic white and blacklist behaviors.
 
 I would not depend on it to auto-fix problems with particular senders.
 It's intended to fix problems where a sender you frequently
 communicate with occasionally sends a message that's slightly
 spam-like in appearance. It cannot fix problems of a constant nature,
 as these will just fold into the averages.
 
 See
 http://wiki.apache.org/spamassassin/AutoWhitelist
 
 and:
 http://wiki.apache.org/spamassassin/AwlWrongWay

Does that mean the only way to whitelist senders is manually via the 
local.cf, as I have disabled user prefs? If so, what would be the 
best method to allow mortal users (via http) to whitelist senders? I had 
been using `$f->add_address_to_whitelist ($addr)` but that seems to 
specifically add them to the whitelist DB.

 As for extracting the AWL, you'll need the check-whitelist script.
 This comes in the source tarball in the tools directory, but most
 distro-packages do not install it. You can get it by downloading the
 3.0.3 tarball from: http://archive.apache.org/dist/spamassassin/

Perhaps this is no longer necessary. What I really need is a way to 
ensure that when someone reports that so-and-so's mail is being bounced, I 
can make sure their emails get through regardless.

Any other thoughts?


Re: Testing whitelist

2006-09-13 Thread Theo Van Dinter
On Wed, Sep 13, 2006 at 03:42:12PM +0100, Dermot Paikkos wrote:
 Does that mean the only way to whitelist senders is manually via the 
 local.cf as I have disabled user_prefers? If so, what would be the 

If you want something specifically always whitelisted, yes, it needs
a whitelist_* config somewhere.  If user prefs are disabled, it would
need to be in a site-wide config file, though not necessarily local.cf
(*.cf is fine).

 best method allow mortal users (via http) to whitelist senders. I had 
 been using `$f-add_address_to_whitelist ($addr)` but that seems to 
 specifically add them to the whitelist DB.

There's no SA function that will force a whitelist/create a config file
for you.  If you have a web interface already, I'd add some code to
allow users to paste in the headers of a message they want whitelisted.
From there, you can parse out the information (using SA function if you
like) to create a whitelist entry (try to do whitelist_from_rcvd, and
only fall back to whitelist_from if necessary, since it's easily forged).
I'd probably save that info in a DB or something, and then periodically
update a cf file and restart spamd.

Even better, if someone wants a sender whitelisted, do it in whatever you have
calling SA if possible.

-- 
Randomly Generated Tagline:
Windows and MacOS are products, contrived by engineers in the service of
 specific companies. Unix, by contrast, is not so much a product as it is a
 painstakingly compiled oral history of the hacker subculture. - N. Stephenson




RE: Ask about more detail rule description

2006-09-13 Thread Bowie Bailey
L's wrote:
 I want to know is there any document, that for the rules description
 in a simple words?
 
 Since the description generated by X-Spam-Report: is too technical.
 
 Does there any document description easier to read than:
 [http://spamassassin.apache.org/tests_3_1_x.html]

Those are the only descriptions we have for the rules.  If that is not
detailed enough, you'll have to take a look at the regex and see what 
it is doing.

-- 
Bowie


Re: Testing whitelist

2006-09-13 Thread Beginner
On 13 Sep 2006 at 10:50, Theo Van Dinter wrote:

 If you want something specifically always whitelisted, yes, it needs
 a whitelist_* config somewhere.  If user prefs are disabled, it would
 need to be in a site-wide config file, though not necessarily local.cf
 (*.cf is fine).

That sounds reasonable. I'll create a file (whitelist.cf) for 
manually whitelisting senders. Am I right in thinking that I will 
need to HUP SA after each edit?

  best method allow mortal users (via http) to whitelist senders. I had 
  been using `$f-add_address_to_whitelist ($addr)` but that seems to 
  specifically add them to the whitelist DB.
 
 There's no SA function that will force a whitelist/create a config file
 for you.  If you have a web interface already, I'd add some code to
 allow users to paste in the headers of a message they want whitelisted.
 From there, you can parse out the information (using SA function if you
 like) to create a whitelist entry (try to do whitelist_from_rcvd, and
 only failback to whitelist_from if necessary since it's easily forged).
 I'd probably save that info in a DB or something, and then periodically
 update a cf file and restart spamd.

I agree using the header is the best method, but I can't imagine my 
users cutting and pasting headers. I think a saved email, uploaded 
and parsed, as you say, with `parse` would be an easier route for my 
users to take. The only other concern is security of the files in 
/etc/spamassassin.  That needs some thought, perhaps SUExec might 
help.

 Even better, if someone wants a sender whitelisted, do it in whatever you have
 calling SA if possible.

I have tried this but without success. I am using Exim (exim 4.5 and 
sa-exim) and I appear to have lost the access control that exim 
provides. There would also be the same security problem with this 
method, e.g. allowing httpd to write to files in /etc. Still, it might 
be worth investigating. 

Thanx for taking the time and the advice. I am a bit clearer now.
Regards,
Dp.




RE: Ask about more detail rule description

2006-09-13 Thread John D. Hardin
On Wed, 13 Sep 2006, Bowie Bailey wrote:

 L's wrote:
  I want to know is there any document, that for the rules description
  in a simple words?
  
  Since the description generated by X-Spam-Report: is too technical.
  
  Does there any document description easier to read than:
  [http://spamassassin.apache.org/tests_3_1_x.html]
 
 Those are the only descriptions we have for the rules.  If that is not
 detailed enough, you'll have to take a look at the regex and see what 
 it is doing.

...I doubt *that* will be less technical... :)

L's, which of the descriptions aren't clear? Admittedly some of them
are based on violations of the standards and are not going to be
meaningful to anyone who does not understand the standards, but what's
unclear about, for example, "Message body has 90-100% blank lines"?

A suggestion: one wiki page for each rule where the description is too
short to be clear, on which the rule's meaning and rationale is
explained in whatever level of detail is considered adequate and
clear.

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174    pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  The difference is that Unix has had thirty years of technical
  types demanding basic functionality of it. And the Macintosh has
  had fifteen years of interface fascist users shaping its progress.
  Windows has the hairpin turns of the Microsoft marketing machine
  and that's all.    -- Red Drag Diva
---
 4 days until The 219th anniversary of the signing of the U.S. Constitution




Re: Rule help needed

2006-09-13 Thread kavaXtreme

Thanks for your VERY helpful input. That's exactly the kind of stuff they
don't tend to cover in a general overview of how to write rules, and exactly
the kind of stuff I need to know.

Unfortunately SpamAssassin is pretty hobbled on a Cpanel account on a shared
server. I contacted the help desk to clarify, and neither of my web hosts
allows custom rules (or Razor2, DCC, Pyzor). And I'm not ready to spring for
a dedicated server right now.

So, it looks like I'm going to have to use a less elegant approach: have
SpamAssassin flag spam, have e-mail filters send spam with a high score to a
mailbox for SA to learn from, and use a script to process the contents. Oh
well.




Loren Wilton wrote:
 
 header  ROMPE_BADRECIPS  To =~ /(uucp|majordomo|root)[EMAIL PROTECTED]/i
 
 Bowie has answered your questions.  A couple of comments on the regex
 above.
 
 You should be using (?: instead of just ( to introduce the group.  Without 
 the ?: it is a capturing group that will capture the text found.  But you 
 aren't using the captured text, so this just considerably slows down the 
 regex processing.
 
 You also should escape the dot in .com.  What you have now will match any 
 character, not just a dot.
 
 You should probably also make sure that there is a word break before the 
 username, so that you don't inadvertently hit on mymajordomo or similar.
 
 So you end up with:
 
 To =~ /\b(?:uucp|majordomo|root)[EMAIL PROTECTED]/i
 
 
 Loren
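
The three suggestions (non-capturing group, escaped dot, leading word break) can be checked quickly outside SpamAssassin. The sketch below uses Python's re module, which treats these constructs the same way Perl does; because the address in the original rule is redacted, example.com is a stand-in domain chosen only for this illustration.

```python
import re

# Original style: capturing group, unescaped dot, no word boundary.
loose = re.compile(r"(uucp|majordomo|root)@example.com", re.IGNORECASE)
# Revised style: non-capturing group, escaped dot, leading \b.
tight = re.compile(r"\b(?:uucp|majordomo|root)@example\.com", re.IGNORECASE)

samples = [
    "root@example.com",         # intended hit
    "mymajordomo@example.com",  # should not hit: different local part
    "root@exampleXcom",         # should not hit: the dot must be literal
]

if __name__ == "__main__":
    for s in samples:
        print(s, bool(loose.search(s)), bool(tight.search(s)))
```

The loose pattern hits all three samples, the tight one only the first; the tight pattern also stores no captured text, which is the source of the speed-up mentioned above.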
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Rule-help-needed-tf2260084.html#a6287888
Sent from the SpamAssassin - Users forum at Nabble.com.



Re: Testing whitelist

2006-09-13 Thread Theo Van Dinter
On Wed, Sep 13, 2006 at 04:14:41PM +0100, Beginner wrote:
 That sounds reasonable. I'll create a file (whitelist.cf) for 
 manually whitelisting senders. Am I right in thinking that I will 
 need to HUP SA after each edit?

Yes.

 provides. The would also have the same security problem with this 
 method, EG: allowing a httpd to write to file in /etc. Still it might 
 be a worth investigating. 

Well, don't write directly to the files. ;)   I would recommend just saving
the information into a DB or other file, then generate the appropriate config
files from there using a paranoid script w/ the appropriate security privs.
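
As a rough illustration of that "paranoid script", the sketch below turns stored (address, relay) pairs into config lines and refuses anything that does not look like a plain address or hostname, so nothing submitted through a web form can smuggle extra directives into the file. The function name, the validation patterns and the example entries are assumptions made for this sketch; reading from the DB, writing the file and restarting spamd are left out.

```python
import re

# Deliberately narrow patterns; * and ? are allowed because whitelist
# addresses may use globs. fullmatch() is used so a trailing newline
# cannot slip through the way it can with "$".
ADDR_RE = re.compile(r"[A-Za-z0-9._%+*?-]+@[A-Za-z0-9.*?-]+")
RELAY_RE = re.compile(r"[A-Za-z0-9.-]+")


def render_whitelist(entries):
    """entries: iterable of (address, relay_domain or None) pairs."""
    lines = ["# Generated file; edit the source data instead."]
    for addr, relay in entries:
        if not ADDR_RE.fullmatch(addr):
            raise ValueError(f"rejected address: {addr!r}")
        if relay:
            if not RELAY_RE.fullmatch(relay):
                raise ValueError(f"rejected relay: {relay!r}")
            # Preferred form: also requires the mail to arrive via this relay.
            lines.append(f"whitelist_from_rcvd {addr} {relay}")
        else:
            # Fallback form: matches on the (easily forged) From address only.
            lines.append(f"whitelist_from {addr}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_whitelist([("alice@example.com", "example.com"),
                            ("bob@example.org", None)]), end="")
```

The generated text would be written to a site-wide *.cf file by a job running with suitable privileges, so the web server itself never needs write access to the config directory.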

-- 
Randomly Generated Tagline:
Never go off on tangents, which are lines that intersect a curve at only
 one point and were discovered by Euclid, who live in the 6th century,
 which was an era dominated by the Goths, who lived in what we now know
 as Poland. - Unknown from Nov. 1998 issue of Infosystems Executive.




Re: Rule help needed

2006-09-13 Thread kavaXtreme

Man, I wish I'd tried asking my question here a LONG time ago. You guys have
been so helpful! Thanks a ton! You rock!
-- 
View this message in context: 
http://www.nabble.com/Rule-help-needed-tf2260084.html#a6287909
Sent from the SpamAssassin - Users forum at Nabble.com.



RE: Ask about more detail rule description

2006-09-13 Thread Bowie Bailey
John D. Hardin wrote:
 On Wed, 13 Sep 2006, Bowie Bailey wrote:
 
  L's wrote:
   I want to know is there any document, that for the rules
   description in a simple words? 
   
   Since the description generated by X-Spam-Report: is too
   technical. 
   
   Does there any document description easier to read than:
   [http://spamassassin.apache.org/tests_3_1_x.html]
  
  Those are the only descriptions we have for the rules.  If that is
  not detailed enough, you'll have to take a look at the regex and
  see what it is doing.
 
 ...I doubt *that* will be less technical... :)

Well, the subject line is Ask about more detail rule description.

Which is it?  More detail or less technical? :)

-- 
Bowie


perfecting Bayes

2006-09-13 Thread benthere-nine
I can't complain but sometimes I still do. - Joe
Walsh

Our SA 3.1.4 server's running great.  The Bayes scan
is set to auto-learn and is running fine.  According
to sa-learn --dump magic, we have

73352  non-token data: nspam
8453   non-token data: nham
220226 non-token data: ntokens

To my knowledge, we have had 0 false positives.  It's
so good we've upped the scores considerably:

SCORE BAYES_99  3.5
SCORE BAYES_95  3.0
SCORE BAYES_80  2.0
SCORE BAYES_60  1.5

Unfortunately, some text spam of the same format
still gets scored BAYES_50 or less.  We keep running
sa-learn --spam on these pesky critters, the database
learns tokens, but there's seemingly no progress.

No one ever said the Bayes scan would be perfect, but
that's what we want.



Re: perfecting Bayes

2006-09-13 Thread Theo Van Dinter
On Wed, Sep 13, 2006 at 09:08:02AM -0700, benthere-nine wrote:
 No one ever said the Bayes scan would be perfect, but
 that's what we want.

Patches welcome. ;)

-- 
Randomly Generated Tagline:
Brevity is the soul of wit... - Hamlet




Re: perfecting Bayes

2006-09-13 Thread Justin Mason

Theo Van Dinter writes:
 On Wed, Sep 13, 2006 at 09:08:02AM -0700, benthere-nine wrote:
  No one ever said the Bayes scan would be perfect, but
  that's what we want.
 
 Patches welcome. ;)

One thing I was thinking of was plugins to supplement Bayes with other
forms of machine learning algorithms.  For example, logistic regression,
Andrej Bratko's PPM-D compression classifier, and others seem to be coming
out with consistently good results in recent testing -- there's no need
for SpamAssassin's only user-trainable machine learning component to be
Robinson-style Bayes, if it's possible to get good results with other
systems too ;)

--j.


Re: perfecting Bayes

2006-09-13 Thread Clay Davis
You can't always get what you want - Rolling Stones  :-)

Clay

 On 9/13/2006 at 12:08 PM, in message
[EMAIL PROTECTED],
benthere-nine [EMAIL PROTECTED] wrote:
I can't complain but sometimes I still do. - Joe
Walsh

snip

No one ever said the Bayes scan would be perfect, but
that's what we want.


Fishing

2006-09-13 Thread Fábio Gomes
Hi list,

Is there any way to block messages with links to executables like 
*.exe, 
*.com and *.scr?

Best Regards,
Fábio Gomes


Re: Fishing

2006-09-13 Thread Evan Platt

At 11:09 AM 9/13/2006, you wrote:

Hi list,

Is there any way to block messages with links to 
executables like *.exe,

*.com and *.scr?


Not with SpamAssassin, but possibly with whatever MUA you have. 





Re: Fishing

2006-09-13 Thread Ed Kasky

At 11:10 AM Wednesday, 9/13/2006, Michel Vaillancourt wrote -=

Fábio Gomes wrote:
 Hi list,

   Is there any way to block messages with 
links to executables like *.exe,

 *.com and *.scr?

   Best Regards,
   Fábio Gomes

If you are using Postfix as your MTA, this isn't hard to do at all.


Or - if you are using procmail:

#Delete all messages with exe attachments
:0
* ^content-type: application/octet-stream
/dev/null

Ed Kasky
~
Randomly Generated Quote (461 of 511):
The truth is a precious commodity. That's why I use it so sparingly.
- Mark Twain



Re: Fishing

2006-09-13 Thread Fábio Gomes
I didn't mean removing EXE attachments, but blocking/high scoring messages 
with links to executables in its body.

Is it possible?

BTW, I'm using qmail.

Regards,
Fábio Gomes

On Wednesday 13 September 2006 15:34, Ed Kasky wrote:
 At 11:10 AM Wednesday, 9/13/2006, Michel Vaillancourt wrote -=

 Fábio Gomes wrote:
   Hi list,
  
 Is there any way to block messages with
 
  links to executables like *.exe,
 
   *.com and *.scr?
  
 Best Regards,
 Fábio Gomes
 
  If you are using Postfix as your MTA, this isn't hard to do at
  all.

 Or - if you are using procmail:

 #Delete all messages with exe attachments

 :0

 * ^content-type: application/octet-stream
 /dev/null

 Ed Kasky
 ~
 Randomly Generated Quote (461 of 511):
 The truth is a precious commodity. That's why I use it so sparingly.
 - Mark Twain


Re: Fishing

2006-09-13 Thread Bill Randle

 At 11:10 AM Wednesday, 9/13/2006, Michel Vaillancourt wrote -=
Fábio Gomes wrote:
  Hi list,
 
Is there any way to block messages with
 links to executables like *.exe,
  *.com and *.scr?
 

 If you are using Postfix as your MTA, this isn't hard to do at
 all.

 Or - if you are using procmail:

 #Delete all messages with exe attachments
 :0
 * ^content-type: application/octet-stream
 /dev/null

Amavisd-new will also drop attachments with a configurable list
of file extensions, but the question referred to links to exe's,
not actual exe attachments.

-Bill



-- 



Re: Fishing

2006-09-13 Thread Kelson

Bill Randle wrote:

Amavisd-new will also drop attachments with a configurable list
of file extensions, but the question referred to links to exe's,
not actual exe attachments.


Good point -- everyone's primed to think of attachments, it seems.

Here's a stab at it: set up a URI rule.

uri       EXECUTABLE_LINK  /\.(?:exe|scr)$/i
describe  EXECUTABLE_LINK  Links to an executable file
score     EXECUTABLE_LINK  10

Just a starting place, mind you -- you may want to make it more or less 
specific.  And there may still be the occasional site running a binary 
CGI on Windows, such that the server will execute the EXE and output 
HTML, not offer the EXE for download.


.com will, of course, be a challenge.
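
The behaviour of the pattern, and why .com is awkward, can be tried out on a few sample URLs before touching a live config. This is a sketch in Python rather than SpamAssassin itself; the URLs are invented, and the second pattern is a deliberately naive variant shown only to illustrate the problem.

```python
import re

# Same expression as the uri rule: the extension must end the URL.
EXECUTABLE_LINK = re.compile(r"\.(?:exe|scr)$", re.IGNORECASE)
# Naive variant that simply adds "com" to the list of extensions.
NAIVE_WITH_COM = re.compile(r"\.(?:exe|scr|com)$", re.IGNORECASE)


def hits(pattern, url):
    return pattern.search(url) is not None


if __name__ == "__main__":
    for url in ("http://example.org/files/setup.EXE",
                "http://example.org/setup.exe?id=1",
                "http://www.example.com"):
        print(url, hits(EXECUTABLE_LINK, url), hits(NAIVE_WITH_COM, url))
```

Two things show up: a query string after the file name defeats the trailing `$`, and adding "com" to the list flags every bare .com site, so a .com rule has to look at the path rather than just the end of the URL.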

--
Kelson Vibber
SpeedGate Communications www.speed.net


Re: Setting up DKIM and DomainKeys mail signing and verification

2006-09-13 Thread Mark Martinec
SM, and others:

 Both milters are being maintained and are similar in reliability.
 dk-milter is not fading in oblivion as there are more domains signing
 with DomainKeys than DKIM.

Usage of dk-milter may not be fading, but the interest in fixing standing bugs
seems to be lost (contrary to dkim-milter, where MSK is there, willing to fix 
flaws - thanks Murray!).

Three of my dk-milter bug reports (a month old) are about broken signatures, 
and nobody seems to care. Don't know how serious other older unresolved 
problem reports are. I can only conclude that, either it is not widely used, 
or people do not care if certain types of mail messages are incorrectly 
handled. An 'it works for me' attitude I guess.

I'd be delighted if proven wrong!


 | score DK_VERIFIED -1.5
 Note that some spam is DK signed.

True, there is a paragraph about that by the end of my text.

Nevertheless, I think it is worth giving a little global motivation
for people to start signing their mail - both to spammers and to regular
users alike. We may lose 1.5 score points to some spam (which may let
through a few more marginal spam messages below the gate), but we gain
a bit more information about the spammer - if a mail was signed, we know
the sender is a current owner of the domain, not just an anonymous
controller of an army of spambots. This may bring more pressure
on registrars to trim down domain kiting practices, and on big ISPs
to better control their user base or risk being given a few blacklist
points. And the 1.5 points is not always a loss; in many cases
it saves a legitimate message from being treated as a false positive.

It is easier to steer a river flow to where we want it to go
than to be shuffling water in a few months' time. Or, to take another
analogy, in the words of a flower (Antoine de Saint-Exupery):

  Well, I must endure the presence of two or three caterpillars
  if I wish to become acquainted with the butterflies. It seems
  that they are very beautiful.


* both the dkim-milter 0.5.1 and the dk-milter 0.4.1 need a patch as
  described in the Postfix documentation file MILTER_README.

 IIRC, the Workarounds section of the Postfix documentation file is
 being read incorrectly.   Dkim-milter and dk-milter do not require any
 patch.

Well, the word 'needs' may be too strong, both milters work without
the patch as well, but the log is ugly:

without patch:

dkim-filter[76335]: Sendmail DKIM Filter v0.5.1 starting ...
dkim-filter[76335]: WARNING: sendmail symbol 'i' not available
dkim-filter[76335]: (unknown-jobid): no signature data
dkim-filter[76335]: (unknown-jobid): no signature data
dkim-filter[76335]: (unknown-jobid): no signature data
dkim-filter[76380]: (unknown-jobid): can't parse From: header
dkim-filter[76335]: (unknown-jobid): no signature data

with the patch (taken from its repository at SourceForge):

dkim-filter[74857]: Sendmail DKIM Filter v0.5.1 starting ...
dkim-filter[66366]: 11D2117B8F2: no signature data
dkim-filter[66366]: 9748F17B8E8: no signature data
dkim-filter[66366]: 5A9A017B8E1: no signature data


  Mark




Re: Fishing

2006-09-13 Thread John D. Hardin
On Wed, 13 Sep 2006, Fábio Gomes wrote:

 Is there any way to block messages with links to executables like
 *.exe, *.com and *.scr?

I will be adding that to my email security tool this week.

http://www.impsec.org/email-tools/procmail-security.html

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174    pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  The difference is that Unix has had thirty years of technical
  types demanding basic functionality of it. And the Macintosh has
  had fifteen years of interface fascist users shaping its progress.
  Windows has the hairpin turns of the Microsoft marketing machine
  and that's all.    -- Red Drag Diva
---
 4 days until The 219th anniversary of the signing of the U.S. Constitution



Re: Fishing

2006-09-13 Thread Steve Thomas
 .com will, of course, be a challenge.

/htt[p|ps]:\/\/.*?\/.*\.com$/i





Re: Fishing

2006-09-13 Thread Steve Thomas
 .com will, of course, be a challenge.

 /htt[p|ps]:\/\/.*?\/.*\.com$/i

Correction! That should be:

/htt(p|ps):\/\/.*?\/.*\.com$/i

and slightly more efficient (doesn't capture backreference):

/htt(?:p|ps):\/\/.*?\/.*\.com$/i
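
To see why the first version needed correcting: inside square brackets,
[p|ps] is a character class that matches exactly one character out of
'p', '|' or 's', so it can never match "https". Below is a minimal sketch
in Python rather than Perl (the constructs used here behave the same way
in Python's re module, and the slashes need no escaping there). The
sample URIs and the helper name matches() are made up for illustration.

```python
import re

# The pattern as first posted: [p|ps] is a character class, i.e. exactly
# one character out of 'p', '|' or 's'.
broken = re.compile(r"htt[p|ps]://.*?/.*\.com$", re.IGNORECASE)

# The corrected pattern: a non-capturing group with real alternation.
fixed = re.compile(r"htt(?:p|ps)://.*?/.*\.com$", re.IGNORECASE)

# Hypothetical sample URIs, for illustration only.
samples = [
    "http://example.com/files/setup.com",
    "https://example.com/files/setup.com",
    "htts://example.com/files/setup.com",  # not a real scheme
]


def matches(pattern, uri):
    """Return True if the pattern is found anywhere in the URI."""
    return pattern.search(uri) is not None


for uri in samples:
    # Columns: URI, result of the broken pattern, result of the fixed one.
    print(uri, matches(broken, uri), matches(fixed, uri))
```

The character-class version misses the https link and accepts the bogus
"htts" scheme; the grouped version does the opposite on both.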





Re: Fishing

2006-09-13 Thread hamann . w
 
 Bill Randle wrote:
  Amavisd-new will also drop attachments with a configurable list
  of file extensions, but the question referred to links to exe's,
  not actual exe attachments.
 
 Good point -- everyone's primed to think of attachments, it seems.
 
 Here's a stab at it: set up a URI rule.
 
 uri       EXECUTABLE_LINK  /\.(?:exe|scr)$/i
 describe  EXECUTABLE_LINK  Links to an executable file
 score     EXECUTABLE_LINK  10
 
 Just a starting place, mind you -- you may want to make it more or less 
 specific.  And there may still be the occasional site running a binary 
 CGI on Windows, such that the server will execute the EXE and output 
 HTML, not offer the EXE for download.

A .scr probably would not be used as a CGI...

The other way round: it is very easy to create a PHP script that offers an
exe for download. So just scoring direct .exe links might cause the bad
guys to produce better download links.
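
A rough illustration of that point, as a Python sketch. It only imitates
the regex from the EXECUTABLE_LINK rule quoted above; it does not
reproduce how SpamAssassin itself extracts or normalizes URIs. The URIs
and the helper name is_flagged() are invented for the example.

```python
import re

# Same pattern as in the quoted EXECUTABLE_LINK rule.
executable_link = re.compile(r"\.(?:exe|scr)$", re.IGNORECASE)

# Hypothetical URIs, invented for illustration.
direct_link = "http://example.com/downloads/update.exe"
script_link = "http://example.com/download.php?id=42"


def is_flagged(uri):
    """Return True if the URI ends in .exe or .scr (case-insensitive)."""
    return executable_link.search(uri) is not None


print(is_flagged(direct_link))  # the direct link ends in .exe
print(is_flagged(script_link))  # the download script hides the file type
```

The second link can hand out exactly the same executable, but nothing in
the URI itself gives that away, so a suffix test cannot catch it.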

Wolfgang Hamann

 
 .com will, of course, be a challenge.
 
 -- 
 Kelson Vibber
 SpeedGate Communications www.speed.net
 






Re: Fishing

2006-09-13 Thread Andreas Pettersson

Steve Thomas wrote:

 /htt(?:p|ps):\/\/.*?\/.*\.com$/i

Why not /https?:\/\/.*?\/.*\.com$/i ?
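
The two spellings should accept exactly the same schemes, since "s?"
simply makes the "s" optional. A quick way to convince yourself is to run
both over the same inputs; this is a small Python sketch (not Perl, but
the constructs involved behave the same), with invented sample URIs and
a hypothetical helper named verdicts().

```python
import re

group_form = re.compile(r"htt(?:p|ps)://.*?/.*\.com$", re.IGNORECASE)
optional_form = re.compile(r"https?://.*?/.*\.com$", re.IGNORECASE)

# Hypothetical sample URIs, for illustration only.
samples = [
    "http://example.com/a/prog.com",
    "https://example.com/a/prog.com",
    "HTTPS://EXAMPLE.COM/A/PROG.COM",
    "ftp://example.com/a/prog.com",   # wrong scheme
    "http://example.com/",            # .com only in the host name
]


def verdicts(pattern):
    """Return one True/False per sample URI for the given pattern."""
    return [pattern.search(uri) is not None for uri in samples]


print(verdicts(group_form))
print(verdicts(optional_form))
```

Note the last sample: because the pattern insists on a "/" after the host
before it looks for ".com" at the end, a plain link to a .com site is not
flagged, which is what makes .com manageable at all.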



Re: Ask about more detail rule description

2006-09-13 Thread jdow

From: John D. Hardin [EMAIL PROTECTED]

On Wed, 13 Sep 2006, Bowie Bailey wrote:


L's wrote:
 I want to know is there any document, that for the rules description
 in a simple words?
 
 Since the description generated by X-Spam-Report: is too technical.
 
 Does there any document description easier to read than:

 [http://spamassassin.apache.org/tests_3_1_x.html]

Those are the only descriptions we have for the rules.  If that is not
detailed enough, you'll have to take a look at the regex and see what 
it is doing.


...I doubt *that* will be less technical... :)

L's, which of the descriptions aren't clear? Admittedly some of them
are based on violations of the standards and are not going to be
meaningful to anyone who does not understand the standards, but what's
unclear about, for example, "Message body has 90-100% blank lines"?

A suggestion: one wiki page for each rule where the description is too
short to be clear, on which the rule's meaning and rationale is
explained in whatever level of detail is considered adequate and
clear.


Thank you for volunteering to do it, John. That was a magnificent
gesture on your part.

{^_-}


Re: Broken Dependency

2006-09-13 Thread jdow

From: Jonathan Allen [EMAIL PROTECTED]


Theo,


There's nothing broken.  The message is informational, not an error. (it means
you don't have the DCC plugin loaded which would define that rule, which is
fine if you're not using DCC...)


Thank you for explaining that.  I have been trying to work out why some
spams made it through SA and found that in the log.  Running SA -t on
the (saved) spam told me "Broken Pipe", and running it with -D as well
showed that it broke during base64 decoding - I assume the attachment,
a GIF, was badly encoded.  Otherwise it looked just like the usual text +
something in a GIF spam.


Hm, a -t option that suppresses warnings might be nice, me suspects.
{^_^}


Re: Fishing

2006-09-13 Thread jdow

Visit Wiki. Look for ClamAVPlugin. To save you some effort:
http://wiki.apache.org/spamassassin/ClamAVPlugin

This uses ClamAV as a scanner for virus-laden email.

SpamAssassin NEVER blocks email. You probably can, however, set up a
simple filter for .exe etc. in your MDA. You certainly can do it with
procmail, for example. I found it more hassle than it was worth.

{^_^}
- Original Message - 
From: Fábio Gomes [EMAIL PROTECTED]



Hi list,

Is there any way to block messages with links to executables like *.exe,
*.com and *.scr?

Best Regards,
Fábio Gomes 



Re: Fishing

2006-09-13 Thread John D. Hardin
On 13 Sep 2006 [EMAIL PROTECTED] wrote:

 the other way round - it is very easy to create a php that offers
 an exe for download. So just scoring direct .exe links might
 cause the bad guys to produce better download links

True. As I said in an earlier post, scoring on bare executable URIs is
a low-hanging-fruit test.

Past that we stray into the realm of trying to analyze the URI vs. the
displayed link text to see if it looks like it is an attempt to mask a
hostile URI with a superficially trustworthy URI. Which has been
discussed here before.
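
For anyone who has not seen those discussions, the idea can be sketched
roughly as follows. This is a Python illustration under simplifying
assumptions, not how SpamAssassin or any existing plugin does it: it only
compares host names, only looks at link text that is itself a full
http(s) URL, and all names (LinkCollector, host_of, mismatched_links) and
sample links are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def host_of(text):
    """Return the lowercased host if text is an http(s) URL, else None."""
    try:
        parsed = urlparse(text)
        host = parsed.hostname
    except ValueError:
        return None
    if parsed.scheme in ("http", "https") and host:
        return host.lower()
    return None


def mismatched_links(html):
    """Return links whose displayed URL names a different host than the href."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    suspicious = []
    for href, text in collector.links:
        shown, actual = host_of(text), host_of(href)
        if shown and actual and shown != actual:
            suspicious.append((href, text))
    return suspicious


# Hypothetical message body, for illustration only.
sample = (
    '<p><a href="http://malware.example.net/update.exe">'
    "http://www.example.com/</a> and "
    '<a href="http://www.example.org/">http://www.example.org/</a> and '
    '<a href="http://www.example.org/news">read more</a></p>'
)
print(mismatched_links(sample))
```

Only the first link is reported: its text shows one host while the href
points at another. Links with ordinary text such as "read more" are
ignored here, which is one of many ways a real check would need to be
more careful.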




Re: Ask about more detail rule description

2006-09-13 Thread John D. Hardin
On Wed, 13 Sep 2006, jdow wrote:

  A suggestion: one wiki page for each rule where the description is too
  short to be clear, on which the rule's meaning and rationale is
  explained in whatever level of detail is considered adequate and
  clear.
 
 Thank you for volunteering to do it, John. That was a magnificent
 gesture on your part.

{bows}

Formatting suggestions welcomed.

Of course, *my* ideas about "unclear" may not lead to very many pages
in the wiki...




Re: Ask about more detail rule description

2006-09-13 Thread John D. Hardin
On Wed, 13 Sep 2006, jdow wrote:

   [http://spamassassin.apache.org/tests_3_1_x.html]
  
  Those are the only descriptions we have for the rules.
  
  A suggestion: one wiki page for each rule where the description is too
  short to be clear, on which the rule's meaning and rationale is
  explained in whatever level of detail is considered adequate and
  clear.
 
 Thank you for volunteering to do it, John. That was a magnificent
 gesture on your part.

Request:

Can whoever maintains the tests_3_1_x.html page set up a link from
the description text for each rule to a wiki page for the rule name?

Thanks!




Re: Fishing

2006-09-13 Thread Steve Thomas
 Steve Thomas wrote:

/htt(?:p|ps):\/\/.*?\/.*\.com$/i


 Why not /https?:\/\/.*?\/.*\.com$/i

Because I always forget that the question mark can be used that way, and
if I can't seem to remember it, nobody else gets to use it! That's why. :)

Nice catch.

Steve "atrophying perl skills" Thomas




Message containing bitmaps with random lines not being blocked

2006-09-13 Thread Robert S

I have been getting a large number of messages which are not being
blocked by SA.  Typically they contain a bitmapped text message with
things like "THIS ONE JUST STARTED TRADING" or "CRITICAL INVESTOR
ALERT FOR".  Below this there are several paragraphs of
meaningless sentences and there is also a bitmap containing a white
background and random lines.  This momentarily appears when I open the
message.

Is there a way that these can be blocked?  Typically they get a SA
score less than 5.

I'm running SA 3.1.3 on gentoo linux.   My configs:

$ cat /etc/spamassassin/local.cf
required_score  6
bayes_path  /var/work/bayes/bayes
bayes_file_mode 770
report_safe 0
use_dcc 1
DCC_dccifd_path /var/dcc/dccifd
dcc_timeout 10
add_header  all DCC _DCCB_: _DCCR_

$ cat /etc/spamassassin/v31*
loadplugin Mail::SpamAssassin::Plugin::Razor2
loadplugin Mail::SpamAssassin::Plugin::SpamCop
loadplugin Mail::SpamAssassin::Plugin::AWL
loadplugin Mail::SpamAssassin::Plugin::AutoLearnThreshold
loadplugin Mail::SpamAssassin::Plugin::WhiteListSubject
loadplugin Mail::SpamAssassin::Plugin::MIMEHeader
loadplugin Mail::SpamAssassin::Plugin::ReplaceTags
loadplugin Mail::SpamAssassin::Plugin::DCC


Re: Message containing bitmaps with random lines not being blocked

2006-09-13 Thread jdow

FuzzyOCR - visit the wiki plugins page. It helps.
{^_^}


Re: Bayes conversion from DB to SQL question

2006-09-13 Thread Michael Parker
Tim Rosmus wrote:
 I've been running multiple in/out servers using Bayes and the local
 Bayes DB storage on the local machine[s].  Now I am moving Bayes
 to a site wide SQL setup.   My question is on the sa-learn backup/
 restore from DB to SQL...
 
 Should I back up/restore all local machine Bayes DBs to the central
 SQL server, or should I only pick one machine that seems to have
 the most active Bayes DB, and just move that?
 

Pick the best one and use that.

Michael