Re: Running SA globally?

2004-12-20 Thread Terence Parker
Thanks for that.
I ran sa-learn using the amavis user and indeed, I now get much less 
spam than before. One thing I haven't done, though, is check for 
false positives on the server... that's just so much hassle (is there 
an easy way to do it?).

Anyways, for now sa-learn seems to have been very effective. You should 
try it!

Thanks,
Terence

On 19 Dec 04, at 6:10 AM, Sam Nilsson wrote:
Yes, your post makes sense. Since *all* filtered mail goes through 
amavisd-new, your amavisd-new setup *is* global.

Because amavisd-new *is* run globally as a particular user (for me this 
is user vscan), SA must run as that same user, since amavisd-new 
actually runs SA itself. My /etc/passwd entry for my vscan user reads:

vscan:*:1006:1007:Scanning Virus Account:/var/amavis:/bin/sh
The contents of /var/amavis are:
--snip--
So although I haven't needed to use the sa-learn command yet, when I 
do it will be after doing an su - vscan. This is true anytime I mess 
with any of the SA, Pyzor, Razor or Amavisd-new files. FYI, I also use 
Clamav, which I have also setup to run as the 'vscan' user.

Take Care
- Sam



Re: SA 3.0.2? Why no mail from announce@spamassassin.apache.org

2004-12-20 Thread jdow
You did indeed send an announcement to announce, dev, and users.
I received one copy, for the users list. I'm not sure I am on the
announce list, though.

{^_^}



Re: Running SA globally?

2004-12-20 Thread Sam Nilsson
Terence Parker wrote:
> Thanks for that.
> I ran sa-learn using the amavis user and indeed, I now get much less 
> spam than before. One thing I haven't done though is checked for 
> false-positives on the server... that's just so much hassle (is there an 
> easy way to do it?).
>
> Anyways, for now sa-learn seems to have been very effective. You should 
> try it!
>
> Thanks,
> Terence

Hi Terence,
Cool! I'm glad it had such a great effect. As far as checking for false 
positives, I guess it really depends on how much mail you have to 
process. My server is currently very low volume, so I set amavisd-new to 
quarantine the messages to my email account (over smtp).

Then I filter the mail into 'quarantined-spam' and 'quarantined-viruses' 
folders in my mail client. This way I can see the subject and look at 
the content if necessary. The thing is, with the spam cutoff level of 12 
in amavisd-new, I never once saw a false positive!

Obviously, the 'quarantine-to-email-account' method is not scalable at 
all. You could set up a script to give you a view of the quarantine. For 
instance, you could generate an email that simply contains a list of 
message-ids, subjects and senders. The list could include only messages 
that scored between, say, 6 and 15, to further streamline your view.
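
A minimal sketch of such a summary script, in Python. It assumes each quarantined message is available as raw text and carries an X-Spam-Status header containing "score=N" (SA 2.x writes "hits=N" instead); whether your quarantine keeps that header depends on your amavisd-new setup. The function name and the 6-15 window are only illustrative.

```python
import re
from email import message_from_string

# Matches "score=7.2" (SA 3.x) or "hits=7.2" (SA 2.x) inside X-Spam-Status.
SCORE_RE = re.compile(r"\b(?:score|hits)=(-?\d+(?:\.\d+)?)")


def summarize_quarantine(raw_messages, low=6.0, high=15.0):
    """Return (score, message-id, sender, subject) for borderline messages."""
    rows = []
    for raw in raw_messages:
        msg = message_from_string(raw)
        match = SCORE_RE.search(msg.get("X-Spam-Status", ""))
        if match is None:
            continue  # no score header (assumption violated): skip it
        score = float(match.group(1))
        if low <= score <= high:
            rows.append((score,
                         msg.get("Message-ID", "?"),
                         msg.get("From", "?"),
                         msg.get("Subject", "?")))
    # Lowest scores first: those are the likeliest false positives.
    return sorted(rows)


if __name__ == "__main__":
    sample = [
        "From: a@example.com\nSubject: Invoice\nMessage-ID: <1@example.com>\n"
        "X-Spam-Status: Yes, score=6.4 required=5.0\n\nbody\n",
        "From: b@example.com\nSubject: Pills\nMessage-ID: <2@example.com>\n"
        "X-Spam-Status: Yes, score=31.0 required=5.0\n\nbody\n",
    ]
    for score, msgid, sender, subject in summarize_quarantine(sample):
        print(f"{score:5.1f}  {msgid}  {sender}  {subject}")
```

The output could be mailed to yourself from cron; reading the quarantine directory or mbox is left out because its layout differs between installations.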

For something simpler, try using something like amavislogsumm 
(listed on the amavisd-new website in the contributed software section) 
to look at the senders and recipients of messages marked as spam, as well 
as all kinds of other scoring statistics.

- Sam Nilsson


Re: SA 3.0.2? Why no mail from announce@spamassassin.apache.org

2004-12-20 Thread M.Lucas
On Sun, 2004-12-19 at 16:07 -0700, Alan Baxter wrote:
> On Sun, 19 Dec 2004 13:52:31 +0100, Maurice Lucas [EMAIL PROTECTED]
> wrote:
> > By just checking the SA website I found out that there is a 3.0.2 release
> > from 2004-12-16.
> >
> > Why isn't there an announce from the announce list?
>
> The last announcement that I received from the SpamAssassin-announce
> list was the one for version 2.64 sent 4 August 2004.  Maybe it's not
> being used anymore.  Has anyone received more recent announcements?

The last announcement I have is from 23-10-2004, announcing SA 3.0.1.

Maurice Lucas



Re: AWL confusion

2004-12-20 Thread Rich
> On Sun, 19 Dec 2004 17:02:06 -0500, Rich [EMAIL PROTECTED] wrote:
> > So why on earth is a 17-score given to an address in an auto white-list?
> > Shouldn't an address get a negative score (or, at least, a neutral zero)
> > if it's in a WL?
>
> You may want to read up on the AWL in the WIKI - it explains exactly
> why you're seeing the scores you are.

Ahh, so AWL is not a white-list in any way - it's a sender history
score. That is quite misleading.
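
To make the "sender history" idea concrete, here is a small Python sketch of the averaging as I understand it from the wiki: the AWL rule pulls the current message's score part of the way towards the sender's historical mean. The function name is made up for illustration, and the factor of 0.5 assumes the default auto_whitelist_factor.

```python
def awl_adjustment(current_score, history_total, history_count, factor=0.5):
    """Points the AWL rule would add to (or subtract from) this message.

    history_total / history_count is the sender's mean score so far;
    the adjustment moves the current score `factor` of the way to it.
    """
    if history_count == 0:
        return 0.0  # no history yet, so nothing to pull towards
    mean = history_total / history_count
    return (mean - current_score) * factor


if __name__ == "__main__":
    # A sender whose three earlier messages averaged 40 points:
    # a new message scoring 6 gets pushed up by 17.
    print(awl_adjustment(6.0, 120.0, 3))
    # A sender with a clean history (mean 0.5): a message scoring 8
    # gets pulled down by 3.75.
    print(awl_adjustment(8.0, 1.5, 3))
```

So a large positive AWL score just means the sender's earlier mail scored far higher than this message did.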

Rich


Re: No subject = not spam?

2004-12-20 Thread Marco van den Bovenkamp
Michael Weber wrote:
> Should SA add a subject header if none exists and the message needs to
> be marked?

Yes, it should. And in 3.0.2 it does. One of the things fixed in there.
--
Regards,
Marco.


Spamcop reporting and insecure dependency

2004-12-20 Thread Shane Williams
I recently upgraded a RedHat Enterprise Linux server to SA 3.0.1
(from, I think, 2.64).  Everything went well, but now when I use
spamassassin -r, I get the following error:

    Insecure dependency in connect while running with -T switch at
    /usr/lib/perl5/5.8.0/i386-linux-thread-multi/IO/Socket.pm line 114.

Line 114 in my Socket.pm is:

    if (!connect($sock, $addr)) {

I'm pretty sure this is related to SpamCop reporting, because when I
run spamassassin -r on an email older than three days, I get the
message (well, not really an error, I guess):

    SpamCop - message older than 3 days, not reporting

and in that case I never get the error about the insecure dependency.

Looking in the archives, I see at least one other message about a
similar problem, but no response that I could find.

Of course, as long as I'm on the subject of SpamCop reporting, perhaps
someone can enlighten me.  I was under the impression that one had to
register with SpamCop before reporting emails to them.  Since I
don't have such a registration, what is SpamAssassin doing (or trying
to do) when I run spamassassin -r?  Is there a way to tell
spamassassin not to report to SpamCop when I run spamassassin -r?
--
Public key #7BBC68D9 at| Shane Williams
http://pgp.mit.edu/|  System Admin - UT iSchool
=--+---
All syllogisms contain three lines |  [EMAIL PROTECTED]
Therefore this is not a syllogism  | www.ischool.utexas.edu/~shanew


Re: AWL confusion

2004-12-20 Thread Bill Landry
- Original Message - 
From: Rich [EMAIL PROTECTED]

> > On Sun, 19 Dec 2004 17:02:06 -0500, Rich [EMAIL PROTECTED] wrote:
> > > So why on earth is a 17-score given to an address in an auto white-list?
> > > Shouldn't an address get a negative score (or, at least, a neutral zero)
> > > if it's in a WL?
> >
> > You may want to read up on the AWL in the WIKI - it explains exactly
> > why you're seeing the scores you are.
>
> Ahh, so AWL is not a white-list in any way - it's a sender history
> score. That is quite misleading.

I agree, and that's why I thought that "auto weight leveling" was a more
appropriate and correctly descriptive name than "auto whitelist".  But
that's just my 2 cents...

Bill



RE: more spam gets through since SA 3.x -- Beg to differ

2004-12-20 Thread Steve Bondy
Amavisd-new is leaving off some tests?
I wonder if in this case the perl version is not so much about SA, but
about amavisd-new.
The amavisd-new web site recommends perl 5.8.2 or better.

Just a guess

Steve

> Hi Richard,
>
> > Perl 5.6.1 simply didn't work properly.  It caused Amavis & SA to strip
> > emails of their subject lines, failed to scan properly, had real
> > problems with spawning multiple processes, and reported errors in
> > spamassassin --lint --debug, many of which I was eventually able to
> > resolve; but the worst problems remained, including an inability to run
> > DCC.
>
> hm... that's interesting. I just checked the top-level INSTALL file and
> it says that Perl 5.6.1 is just fine. When I manually invoke spamc, it
> scans just fine. amavisd-new leaves out some tests. Lint does not tell
> me about any errors.
>
> Florian
 


Interesting NW article

2004-12-20 Thread Jerry Bell
There's a big review of anti-spam products at nw fusion here:
http://www.nwfusion.com/reviews/2004/122004spampkg.html?ts
Here's a bit on spamassassin:
http://www.nwfusion.com/reviews/2004/122004spamside6.html

It's a pretty disappointing article.

Jerry
http://www.syslog.org



Spam processing errors

2004-12-20 Thread Joe Zitnik
I know I saw this in a previous thread, but for the life of me I cannot
find it.  I saw some postings where people were reporting that SA was
only processing every other e-mail, or not processing all e-mail.  Was
this the correct list, and if so, can someone point me to the problem
and solution, AND most importantly: Happy Holidays to all on the list.


Re: more spam gets through since SA 3.x -- Beg to differ

2004-12-20 Thread Richard Ozer
Thanks.. I think that's right Steve.  I realized this last night while I was thinking 
about these comments.  I had updated Amavis to the latest version as well...

RO

Steve Bondy wrote:
> Amavisd-new is leaving off some tests?
> I wonder if in this case the perl version is not so much about SA, but
> about amavisd-new.
> The amavisd-new web site recommends perl 5.8.2 or better.
>
> Just a guess
>
> Steve
>
> > Hi Richard,
> >
> > > Perl 5.6.1 simply didn't work properly.  It caused Amavis & SA to strip
> > > emails of their subject lines, failed to scan properly, had real
> > > problems with spawning multiple processes, and reported errors in
> > > spamassassin --lint --debug, many of which I was eventually able to
> > > resolve; but the worst problems remained, including an inability to run
> > > DCC.
> >
> > hm... that's interesting. I just checked the top-level INSTALL file and
> > it says that Perl 5.6.1 is just fine. When I manually invoke spamc, it
> > scans just fine. amavisd-new leaves out some tests. Lint does not tell
> > me about any errors.
> >
> > Florian


Parsing of undecoded UTF-8

2004-12-20 Thread Claus Atzenbeck
Hi,

I'm running Mac OS 10.3.7. I had to make a clean install and therefore I
also had to install SpamAssassin again. I was following the description
given at http://www.stupidfool.org/docs/sa.html.

I used to run SpamAssassin 2.6x. Now, I have version 3.0.2. Everything
seems to work, but sometimes sa-learn writes the following error while
parsing my mbox files:

Parsing of undecoded UTF-8 will give garbage when decoding
entities at ///Library/Perl/5.8.1/Mail/SpamAssassin/HTML.pm line
182.

I never saw that error with the version I used previously.

Did I forget to install a specific Perl module?

Thanks for any hint!
Claus


Re: SpamAssassin doesn't parse any email

2004-12-20 Thread Oleksandr Samoylyk
Changed my local.cf to:

required_score 5.0
rewrite_header Subject *SPAM*
report_safe 1
lock_method flock

[EMAIL PROTECTED] spamassassin]# /usr/bin/./spamassassin --lint
[EMAIL PROTECTED] spamassassin]#

It still doesn't parse emails :(.

What could it be?

-- 
 Oleksandr Samoylyk



Re: Interesting NW article

2004-12-20 Thread Theo Van Dinter
On Mon, Dec 20, 2004 at 10:41:33AM -0500, Jerry Bell wrote:
> Here's a bit on spamassassin:
> http://www.nwfusion.com/reviews/2004/122004spamside6.html
> It's a pretty disappointing article.

Very much so:

Forbidden

You don't have permission to access /reviews/2004/122004spamside6.html on this
server.

Additionally, a 403 Forbidden error was encountered while trying to use an
ErrorDocument to handle the request.

-- 
Randomly Generated Tagline:
As King Arthur said: Some days it all seems so feudal.




Re: SpamAssassin doesn't parse any email

2004-12-20 Thread Theo Van Dinter
On Mon, Dec 20, 2004 at 06:12:19PM +0200, Oleksandr Samoylyk wrote:
> [EMAIL PROTECTED] spamassassin]# /usr/bin/./spamassassin --lint
> [EMAIL PROTECTED] spamassassin]#
>
> Still it doesn't parse emails :(.
> What it can be?

What do you mean by "doesn't parse emails"?  How are you calling SpamAssassin?

-- 
Randomly Generated Tagline:
... although it's better if you call it an osculating circle because nobody 
 knows what it means.  Except those smarty-pants math professors...
  - Prof. Farr




Re: Interesting NW article

2004-12-20 Thread Marco van den Bovenkamp
Theo Van Dinter wrote:
> Very much so:
>
> Forbidden
>
> You don't have permission to access /reviews/2004/122004spamside6.html on this
> server.
>
> Additionally, a 403 Forbidden error was encountered while trying to use an
> ErrorDocument to handle the request.

?? Works here...
--
Regards,
Marco.


Re: Interesting NW article

2004-12-20 Thread Jerry Bell
Very strange.  The link still works for me and everyone I've asked to try
it.  Maybe they're doing some sort of server side blocking?

Here's a snippet from the article:
The short answer is that no one submitted it, but of course there's more
to it than that. This year we reached out to the SpamAssassin community
and asked them to participate. Although a few well-meaning souls
volunteered to be the contacts for SpamAssassin, when it came time to test
no one would step up to the plate and represent the product at a level
that would make it competitive to the other enterprise-focused vendors.

They do talk favorably of spamassassin in a few parts, but overall they
seem to have missed the boat.

Jerry
http://www.syslog.org

 On Mon, Dec 20, 2004 at 10:41:33AM -0500, Jerry Bell wrote:
 Here's a bit on spamassassin:
 http://www.nwfusion.com/reviews/2004/122004spamside6.html
 It's a pretty disappointing article.

 Very much so:

 Forbidden

 You don't have permission to access /reviews/2004/122004spamside6.html on
 this
 server.

 Additionally, a 403 Forbidden error was encountered while trying to use an
 ErrorDocument to handle the request.

 --
 Randomly Generated Tagline:
 As King Arthur said: Some days it all seems so feudal.





Re: Interesting NW article

2004-12-20 Thread Theo Van Dinter
On Mon, Dec 20, 2004 at 11:27:23AM -0500, Jim Maul wrote:
> > Forbidden
> > You don't have permission to access /reviews/2004/122004spamside6.html on
> > this server.
>
> Works for me.

Hrm.  Apparently they're just blocking all of my employer's IPs.  I can
get to the page from my home machine, but both work's office and colo
IP ranges get 403s for all requests.

-- 
Randomly Generated Tagline:
To win, you must treat a pressure situation as an opportunity to succeed,
 not an opportunity to fail. - Gardner Dickinson




Re: Interesting NW article

2004-12-20 Thread Tim Donahue
In case anyone else is having problems as well here is the SA-related
portion of the review. 

Tim Donahue


Where's SpamAssassin?
By Joel Snyder 
Network World, 12/20/04

The short answer is that no one submitted it, but of course there's
more to it than that. This year we reached out to the SpamAssassin
community and asked them to participate. Although a few well-meaning
souls volunteered to be the contacts for SpamAssassin, when it came time
to test no one would step up to the plate and represent the product at a
level that would make it competitive to the other enterprise-focused
vendors. 

Interest in SpamAssassin is understandable. In the small-business
market, the open source SpamAssassin dominates many anti-spam systems.
When well tuned and integrated by a value-added reseller (VAR) that
knows what it is doing, it turns out to be a very effective system.
SpamAssassin users routinely report 100% spam reduction and 0% false
positives (although these self-reported statistics are probably biased),
and are generally overjoyed with the results.

By itself, SpamAssassin is little more than the software implementation
of an interesting idea: apply statistics, neural networks and Bayesian
probabilities to the problem of classifying mail as spam or not. Train
the engine by giving it desirable and undesirable mail, and it can tell
you for each new message what pile it most resembles. It turns out to
work astonishingly well, especially in small businesses where mail flow
is very homogeneous. SpamAssassin's Bayesian engine even redefines the
meaning of spam by letting you say, "This is the mail I want," and "This
mail I don't want." SpamAssassin also mixes other tools into its scoring
system, such as DNS-based blacklists and collaborative scoring, as well
as more traditional keyword searches and formatting tests. 

The key to SpamAssassin's success, though, is a smart VAR or IT person
installing it. SpamAssassin requires a significant amount of integration
work to make an enterprise-class installation succeed. Without a GUI,
database, quarantine, anti-virus scanner, policy or per-user
configuration, SpamAssassin is a great tool for those who want to build
their own anti-spam system, but is in no way a solution by itself. 

This doesn't mean that SpamAssassin wasn't well represented in our test.
The important core of SpamAssassin, a Bayesian engine, was recognizable
in at least one-third of the products we tested and might well have been
hidden in the guts of more. The strategy of combining multiple tests to
identify spam is in nearly all modern, anti-spam products, including
SpamAssassin. 

The difficulty in testing or recommending products that require heavy
engine training, or ones based on trained neural networks, is that
companies with many employees have very diverse mail flows, and the
training will likely generate false positives or negatives across large
numbers of users. For example, a multinational company might have many
employees who don't read or speak Italian, and might train all their
Italian mail as spam - something that would upset the Milan and Rome
offices. Or imagine IDG, which owns many publications, all which have
specialized vocabularies. No one set of training mail would work for the
different communities. 

Products that successfully include a Bayesian recognizer, such as
SpamAssassin, do so by considering it as one factor in the larger
cocktail of spam identification. By weighting the Bayesian verdict with
other information, vendors have followed the trail that SpamAssassin
blazed and made it enterprise-ready.



On Mon, 2004-12-20 at 11:12 -0500, Theo Van Dinter wrote:
 On Mon, Dec 20, 2004 at 10:41:33AM -0500, Jerry Bell wrote:
  Here's a bit on spamassassin:
  http://www.nwfusion.com/reviews/2004/122004spamside6.html
  It's a pretty disappointing article.
 

 Additionally, a 403 Forbidden error was encountered while trying to use an
 ErrorDocument to handle the request.




Re: SpamAssassin doesn't parse any email

2004-12-20 Thread Oleksandr Samoylyk
> On Mon, Dec 20, 2004 at 06:12:19PM +0200, Oleksandr Samoylyk wrote:
> > [EMAIL PROTECTED] spamassassin]# /usr/bin/./spamassassin --lint
> > [EMAIL PROTECTED] spamassassin]#
> >
> > Still it doesn't parse emails :(.
> > What it can be?
>
> What do you mean doesn't parse emails?

It doesn't check emails for spam. So it's running but doesn't do its
work.

> How are you calling SpamAssassin?

/usr/bin/spamd -d -c -m 5

More about my environment:

I use Exim as MTA.

In exim.conf I have:

# Spam Assassin
spamcheck_director:
  driver = accept
  condition = ${if and { \
  {!def:h_X-Spam-Flag:} \
  {!eq {$received_protocol}{spam-scanned}} \
  {!eq {$received_protocol}{local}} \
  {exists{/home/${lookup{$domain}lsearch{/etc/virtual/domainowners}{$value}}/.spamassassin/user_prefs}} \
} {1}{0}}
  retry_use_local_part
  transport = spamcheck
  no_verify

and

spamcheck:
  driver = pipe
  batch_max = 100
  command = /usr/sbin/exim -oMr spam-scanned -bS
  current_directory = /tmp
  group = mail
  home_directory = /tmp
  log_output
  message_prefix = 
  message_suffix = 
  return_fail_output
  no_return_path_add
  transport_filter = /usr/bin/spamc -u ${lookup{$domain}lsearch*{/etc/virtual/domainowners}{$value}}
  use_bsmtp
  user = mail

BTW, my full Exim configuration file & init script are attached.
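
Since --lint only checks the SpamAssassin rules and not the Exim side, it may help to walk through the spamcheck_director condition by hand. The Python sketch below is only an approximation of that condition, not Exim itself: the function name is made up, the domainowners file is stood in for by a dict, and a failed lookup is treated as "no match". It returns whether a message would be handed to the spamcheck transport and, if not, which test stopped it.

```python
import os


def router_would_scan(headers, received_protocol, domain, domainowners,
                      prefs_exists=os.path.exists):
    """Approximate the spamcheck_director condition; returns (bool, reason)."""
    names = {name.lower() for name in headers}
    if "x-spam-flag" in names:
        return False, "X-Spam-Flag header already present"
    if received_protocol == "spam-scanned":
        return False, "message was already re-injected by the transport"
    if received_protocol == "local":
        return False, "message was submitted locally"
    owner = domainowners.get(domain)
    if owner is None:
        return False, f"{domain} has no entry in domainowners"
    prefs = f"/home/{owner}/.spamassassin/user_prefs"
    if not prefs_exists(prefs):
        return False, f"{prefs} does not exist"
    return True, "routed to the spamcheck transport"


if __name__ == "__main__":
    # Stand-in for /etc/virtual/domainowners (hypothetical contents).
    owners = {"example.com": "alice"}
    print(router_would_scan({"Subject": "hi"}, "esmtp", "example.com", owners))
```

If every test passes on paper but mail still arrives without X-Spam headers, the next thing to check would be whether spamc can actually reach the running spamd.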

-- 
 Oleksandr Samoylyk

exim.conf
Description: Binary data


spamassassin
Description: Binary data


RE: AWL confusion

2004-12-20 Thread Chris Blaise

I agree it's a very misleading term.  

The easiest and most appropriate term I've heard is historical
averaging. 

-Original Message-
From: Bill Landry [mailto:[EMAIL PROTECTED] 
Sent: Monday, December 20, 2004 7:51 AM
To: users@spamassassin.apache.org
Subject: Re: AWL confusion

- Original Message -
From: Rich [EMAIL PROTECTED]

  On Sun, 19 Dec 2004 17:02:06 -0500, Rich [EMAIL PROTECTED] wrote:
  So why on earth is a 17-score given to an address in an auto
white-list?
  Shouldn't an address get a negative score (or, at least, a neutral
zero)
  if it's in a WL?
 
  You may want to read up on the AWL in the WIKI - it explains exactly 
  why you're seeing the scores you are.

 Ahh, so AWL is not a white-list in any way - it's a sender history
 score. That is quite misleading.

I agree, and that why I thought that auto weight leveling was a more
appropriate and correctly descriptive name than auto whitelist.  But
that's just my 2 cents...

Bill




Re: Interesting NW article

2004-12-20 Thread Kenneth Porter
--On Monday, December 20, 2004 11:29 AM -0500 Jerry Bell 
[EMAIL PROTECTED] wrote:

> They do talk favorably of spamassassin in a few parts, but overall they
> seemed to have missed the boat.

From the article:

> The important core of SpamAssassin, a Bayesian engine, was recognizable
> in at least one-third of the products we tested and might well have been
> hidden in the guts of more. The strategy of combining multiple tests to
> identify spam is in nearly all modern, anti-spam products, including
> SpamAssassin.

Seems like they have it backwards. Bayes is a component, not the core.
Also, SA is a component, not a complete solution. With 41 participants in 
the survey, it would be surprising not to find SA integrated into some of 
them. Perhaps some here can identify which products?

Here's the list of participants:
http://www.nwfusion.com/bg/2003/spam/index.jsp


RE: Interesting NW article

2004-12-20 Thread Carnegie, Martin
Well, from our implementation I would say that this article is junk.  We
are running SA with pretty much default config and no Bayes and are
getting about 97% with the only FPs being some mass mailings from
vendors (MS Technet for example).  If we looked at turning on Bayes then
this product would probably be the best out there. 

This quote, "SpamAssassin requires a significant amount of integration
work to make an enterprise-class installation succeed", is bs. We did the
upgrade from 2.64, which worked great, and have not seen any issues, and
the amount of work to implement was about an hour.

So keep up the great work guys and ignore these technical reviews.




Re: Interesting NW article

2004-12-20 Thread C-Store Christoph Peter
I agree. After some minor issues SA works perfectly for us. It runs perfectly on 
a small PPro 200 machine, and catches almost 100%. Only one or two spam 
mails get through.

I'm pretty happy with SA.
Cheers,
C-Store Hard- und Software GmbH
Christoph Peter
Düstere Straße 20
37073 Göttingen
http://www.c-store.de
[EMAIL PROTECTED]
- Original Message - 
From: Carnegie, Martin [EMAIL PROTECTED]
To: users@spamassassin.apache.org
Sent: Monday, December 20, 2004 6:22 PM
Subject: RE: Interesting NW article


Well, from our implementation I would say that this article is junk.  We
are running SA with pretty much default config and no Bayes and are
getting about 97% with the only FPs being some mass mailings from
vendors (MS Technet for example).  If we looked at turning on Bayes then
this product would probably be the best out there.
This quote SpamAssassin requires a significant amount of integration
work to make an enterprise-class installation succeed is bs, we did the
upgrade from 2.64 which worked great and have not seen any issues and
the amount of work to implement was about an hour.
So keep up the great work guys and ignore these technical reviews.




Re: more spam gets through since SA 3.x -- Beg to differ

2004-12-20 Thread Florian Effenberger
Hi Steve,

> Amavisd-new is leaving off some tests?
> I wonder if in this case the perl version is not so much about SA, but
> about amavisd-new.
> The amavisd-new web site recommends perl 5.8.2 or better.

thanks, will check that!

Florian


Re: Interesting NW article

2004-12-20 Thread Marco van den Bovenkamp
Kenneth Porter wrote:
> Also, SA is a component, not a complete solution. With 41 participants 
> in the survey, it would be surprising not to find SA integrated into 
> some of them. Perhaps some here can identify which products?

In the article 
(http://www.nwfusion.com/reviews/2004/122004spamside2.html) they mention 
the Roaring Penguin product (CanIt) and that of Paessler GmbH (No Spam 
Today!) as based on SA.

--
Regards,
Marco.


low scores?

2004-12-20 Thread Rich
I have recently upgraded from 2.x to 3.0.1 and have been watching the
scores for stuff that is real spam. I had a bunch of up-weighted scores in
2.x but I didn't move those over to the new version while I evaluated what
the new version was doing. What I don't understand are what seem to be
extremely low scores for various tests. For instance, this is the report:

Content analysis details:   (1.9 points, 5.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.0 HTML_40_50             BODY: Message is 40% to 50% HTML
 0.0 HTML_MESSAGE           BODY: HTML included in message
 1.9 BAYES_99               BODY: Bayesian spam probability is 99 to 100%
                            [score: 1.]

on a message that had a content preview of:

Content preview:  a href=http://imsodamtired.com/?wid=100049; Why b u  
  y from World Wide Meds?brbr # No Prescription
Requiredbr #   Discrete  Confidential
Packag i n gbr # World Wide Shippingbr #  
Quality Generic Medi.c.ationsbr # 1 0 0 % M0ney Back Guarant e ebr
/a brbrbrbrbrbr a

etc. (i.e. no-doubt-about-it spam) yet there are zero scores for the two
HTML tests and only! 1.9 for the BAYES_99 test. I don't run any network
tests because I'm behind a corporate firewall and they are unreliable in
this environment.

My question is why are these scores so low? If 5 is a typical spam/ham
threshold, these messages should be scoring close to that based on the
BAYES_99 hit alone.

If the engine is expecting to be able to use network tests for these then
shouldn't the default scores be higher if those tests are turned off?

Rich


RE: Interesting NW article

2004-12-20 Thread Chris Santerre


-Original Message-
From: Carnegie, Martin [mailto:[EMAIL PROTECTED]
Sent: Monday, December 20, 2004 12:23 PM
To: users@spamassassin.apache.org
Subject: RE: Interesting NW article


Well, from our implementation I would say that this article is 
junk.  We
are running SA with pretty much default config and no Bayes and are
getting about 97% with the only FPs being some mass mailings from
vendors (MS Technet for example).  If we looked at turning on 
Bayes then
this product would probably be the best out there. 

This quote SpamAssassin requires a significant amount of integration
work to make an enterprise-class installation succeed is bs, 
we did the
upgrade from 2.64 which worked great and have not seen any issues and
the amount of work to implement was about an hour.

So keep up the great work guys and ignore these technical reviews.

Completely agree. We don't use Bayes, and we catch 99%.  Who did these
people contact? 

SA is not that difficult at all to integrate. I think they confuse the
abundance of options with difficulty. 

--Chris


Re: salearn parsing error

2004-12-20 Thread Thomas Arend
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On Monday, 20 December 2004 14:36, Rich wrote:
> Nobody commented on the first mention of this so I'm repeating it:
>
> I move low-scoring but real spam to a folder and then run sa-learn on it.
> Some messages trigger the following error:
>
> Parsing of undecoded UTF-8 will give garbage when decoding entities at
> /usr/local/lib/perl5/site_perl/5.8.6/Mail/SpamAssassin/HTML.pm line 182
>
> Why isn't sa-learn handling these messages correctly?
>
> Rich

Which version do you use?

Thomas
- -- 
icq:133073900
aim:tawhv
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.0 (GNU/Linux)

iD8DBQFBxxmDHe2ZLU3NgHsRAm+0AJ0SnzdCMbDcUdyej9iKCso1R2WZCQCdF9Qw
VuFkB5V+shaFpLUq1z75BgA=
=sS0h
-END PGP SIGNATURE-


Re: low scores?

2004-12-20 Thread Jim Maul
Rich wrote:
I have recently upgrades from 2.x to 3.0.1 and have been watching the
scores for stuff that is real spam. I had a bunch of up-weighted scores in
2.x but I didn't move those over to the new version while I evaluated what
the new version was doing. What I don't understand are what seem to be
extremely low scores for various tests, for instance this is the report:
Content analysis details:   (1.9 points, 5.0 required)
  
 pts rule name 
description
-- ---
0.0 HTML_40_50 BODY: Message is 40% to 50% HTML   
0.0 HTML_MESSAGE   BODY: HTML included in
message  1.9 BAYES_99  
BODY: Bayesian spam probability is 99 to 100%
[score: 1.]

on a message that had a content preview of:
Content preview:  a href=http://imsodamtired.com/?wid=100049; Why b u  
  y from World Wide Meds?brbr # No Prescription
Requiredbr #   Discrete  Confidential
Packag i n gbr # World Wide Shippingbr #  
Quality Generic Medi.c.ationsbr # 1 0 0 % M0ney Back Guarant e ebr
/a brbrbrbrbrbr a

etc. (i.e. no-doubt-about-it spam) yet there are zero scores for the two
HTML tests and only! 1.9 for the BAYES_99 test. I don't run any network
tests because I'm behind a corporate firewall and they are unreliable in
this environment.
My question is why are these score so low? If 5 is a typical spam/ham
these messages should be scoring close to that based on the bayes_99
alone.
If the engine is expecting to be able to use network tests for these then
shouldn't the default scores be higher if those tests are turned off?
Rich

The SA scores are generated based on the scores of other rules and take 
into account the overlap between rules.  From what I understand, BAYES_99 
is scored the way it is because a lot of messages that triggered this rule 
also triggered other rules, and as such its score was lowered.  If 
you don't run those other rules, however (I would imagine network tests 
would be some of them), then I would suggest you bump up the scores for 
the tests you are running to compensate for the tests that aren't 
being run.  This is exactly what I did.  My BAYES_99 has been running at 
4.5 with no problems for a while now.  The ability to change the scores 
of tests is there for exactly this reason - because everyone's system is 
different.  Don't be afraid to override the defaults, but be sure to 
watch closely after you do to check for false positives.
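
A rough Python sketch of the arithmetic, using the three rules and default scores from Rich's report. The helper name is made up, and it assumes a message is tagged once its total reaches the required score; it is not how SA is implemented internally.

```python
# Default scores as they appear in the report quoted above.
DEFAULT_SCORES = {"HTML_40_50": 0.0, "HTML_MESSAGE": 0.0, "BAYES_99": 1.9}


def classify(hits, overrides=None, required=5.0):
    """Sum the scores of the rules that hit; return (total, is_spam)."""
    scores = {**DEFAULT_SCORES, **(overrides or {})}
    total = round(sum(scores[rule] for rule in hits), 3)
    return total, total >= required


if __name__ == "__main__":
    hits = ["HTML_40_50", "HTML_MESSAGE", "BAYES_99"]
    print(classify(hits))                     # defaults: 1.9, not tagged
    print(classify(hits, {"BAYES_99": 4.5}))  # my override: 4.5, still under 5.0
    print(classify(hits, {"BAYES_99": 4.5, "HTML_MESSAGE": 0.5}))  # reaches 5.0
```

Note that with BAYES_99 at 4.5 a message that hits nothing else still lands just under a threshold of 5.0, so it takes one more small hit to tag it. In local.cf the override itself is a single line of the form "score BAYES_99 4.5".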

-Jim


Re: salearn parsing error

2004-12-20 Thread Rich
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On Monday, 20 December 2004 14:36, Rich wrote:
 Nobody commented on the first mention of this so I'm repeating it:

 I move low-scoring but real spam to a folder and then run salearn on it.
 Some messages trigger the following error:

 Parsing of undecoded UTF-8 will give garbage when decoding entities at
 /usr/local/lib/perl5/site_perl/5.8.6/Mail/SpamAssassin/HTML.pm line 182

 Why isn't salearn handling these messages correctly?

 Rich

 Which version do you use?

3.0.1


RE: Bayes question

2004-12-20 Thread Steve Bondy
Just because you learn something as spam doesn't mean it will be
blocked.
SA will add a score to the message based on the bayes rules, but if the
bayes rules are the only ones that get hit, and they score less than
your threshold, it won't keep the stuff out.
For example, the default score in 2.6.x for BAYES_90 is either 2.454 or
2.101.  If that's the only rule you hit, and your threshold is above
those numbers, it will come through.

 -Original Message-
 From: Chuck Campbell [mailto:[EMAIL PROTECTED] 
 Sent: Monday, December 20, 2004 12:02 PM
 To: SpamAssassin Users
 Subject: Bayes question
 
 
 Lately I've been seeing lots of very similar spams get 
 through my 2.6.3.  I don't run autolearn, but I save my spam 
 and ham daily, and run them through sa-learn --spam and --ham 
 respectively.
 
 I'm puzzled why a spam I've manually learned via sa-learn 
 keeps coming through.
 
 What can I check to ensure things are working properly?
 
 BTW, I know I should upgrade, but time isn't available right 
 now, and this setup is catching more than 99.5 percent of the 
 spam coming in.  I'm just curious about bayes not working as 
 expected any longer, although it still catches LOTS of 
 others, so that can't be it completely...
 
 baffled,
 -chuck
 
 


Re: Interesting NW article

2004-12-20 Thread Jon Drukman
Jerry Bell wrote:
Here's a snippet from the article:
The short answer is that no one submitted it, but of course there's more
to it than that. This year we reached out to the SpamAssassin community
and asked them to participate. Although a few well-meaning souls
volunteered to be the contacts for SpamAssassin, when it came time to test
no one would step up to the plate and represent the product at a level
that would make it competitive to the other enterprise-focused vendors.
They do talk favorably of spamassassin in a few parts, but overall they
seemed to have missed the boat.

seems to me like their audience is the point n drool type of network 
admin.  if it doesn't have a setup.exe they classify it as hard to maintain.

yes, you have to read a manpage to set up spamassassin.  yes there are 
many different configs based on the various types of MTAs you can use, 
and whether you want site-wide or per-user configs.  you may have to 
adjust scoring values to get a truly effective config.  it took me maybe 
an hour to get my head around how to set up a per-user config with 
postfix and spamd.  considering the amount of time it's saved me and my 
friends dealing with spam, i'd say it was an excellent time investment.



RE: Interesting NW article

2004-12-20 Thread Tim Donahue
On Mon, 2004-12-20 at 13:31 -0500, Chris Santerre wrote:
 
 Completely agree. We don't use Bayes, and we catch 99%.  Who did these
 people contact? 
 
 SA is not that difficult at all to integrate. I think they confuse the
 abundance of options with difficulty. 
 
 --Chris

I personally think that the best part of SA is its ability to be
integrated in so many different ways and its ability to scale.  It can
scale all the way from just a single user (using procmail, for example)
all the way to a large corporate network using something like Maia
Mailguard and amavisd-new.  Maia works with amavisd-new to allow users
to control their email preferences, such as personal whitelists, and do
things like access their quarantined spams to recover (and report for
Bayes training if desired) false positives.


-- 
Tim Donahue [EMAIL PROTECTED]
Haynes Group, Incorporated



OT Bouncing Spam

2004-12-20 Thread ChupaCabra
My boss is twisting off today because he got 350 messages marked [SPAM] 
over the weekend.  His reaction is "Bounce 'em all, let the ISPs sort 
it out."  I tried explaining about forged headers and the myriad 
other methods spammers use to look like they come from someplace else.  
Apparently he feels like I am blowing smoke.

Does anyone have some good links on why it is not a good idea to bounce 
spam?  I am getting nowhere with my spiel.  Until he hears it from 
somewhere else I am in s--t city.

I can see where he gets the idea, in that I still see people on the 
internets saying bouncing it is good, but in all my readings I have 
learned better.  Or does anyone think bouncing all spam is a good idea?

Thanks ahead.
--
Michael H. Collins  Admiral, Penguinista Navy
http://linuxlink.com
/"\  ASCII Ribbon Campaign
\ /  No HTML/RTF in email
 x   No Word docs in email
/ \  Respect for open standards
Take your laptop and yell out: 
Can a brother get a ip address?




Re: OT Bouncing Spam

2004-12-20 Thread Ralf Hildebrandt
* ChupaCabra [EMAIL PROTECTED]:
 My boss is twisting off today because he got 350 messages marked [SPAM] 
 over the weekend.  His reaction is "Bounce 'em all, let the ISPs sort 
 it out."  I tried explaining about forged headers and the myriad 
 other methods spammers use to look like they come from someplace else.  
 Apparently he feels like I am blowing smoke.
 
 Does anyone have some good links on why it is not a good idea to bounce 
 spam?

Bounce where?

-- 
Ralf Hildebrandt (i.A. des IT-Zentrum)  [EMAIL PROTECTED]
Charite - Universitätsmedizin Berlin            Tel.  +49 (0)30-450 570-155
Gemeinsame Einrichtung von FU- und HU-Berlin    Fax.  +49 (0)30-450 570-962
IT-Zentrum Standort CBF                 send no mail to [EMAIL PROTECTED]


Re: OT Bouncing Spam

2004-12-20 Thread Duncan Hill
On Monday 20 December 2004 20:49, ChupaCabra wrote:
 I can see where he gets the idea, in that I still see people on the
 internets saying bouncing it is good, but in all my readings I have
 learned better.  Or does anyone think bouncing all spam is a good idea?

Backscatter will get you blacklisted these days - there's enough junk mail on 
the net that backscatter doesn't help.

Crank up your SMTP rejections if you can - greylisting works quite well for 
the hit-and-run spammer who doesn't use a real SMTP server to send the spam.  
Add in things like 'don't say HELO with my IP or name, or with a reserved IP' 
and you're doing well.
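
As a rough illustration of the HELO checks above, here is a small Python
sketch of the logic. In practice you would express this in your MTA's own
configuration; the code only shows the idea. The hostname and IP address
are placeholders (assumptions), and the function name is made up for this
example.

```python
import ipaddress

# Placeholders: replace with your server's real name and address.
MY_NAMES = {"mail.example.com"}
MY_IPS = {"192.0.2.25"}


def helo_is_suspicious(helo):
    """Return True if the HELO argument claims to be us or a reserved IP."""
    name = helo.strip().strip("[]").lower()
    # A remote client has no business introducing itself as this server.
    if name in MY_NAMES or name in MY_IPS:
        return True
    try:
        addr = ipaddress.ip_address(name)
    except ValueError:
        return False  # not an IP literal; hostname checks would go elsewhere
    # Private, loopback and reserved ranges should never show up in a HELO
    # from the outside world.
    return addr.is_private or addr.is_loopback or addr.is_reserved


for helo in ("mail.example.com", "[10.1.2.3]", "mx.example.org"):
    print(helo, helo_is_suspicious(helo))
```

Rejecting at SMTP time on checks like these means you never accept the
message, so there is nothing to bounce later.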

Finally, look at using something like amavisd-new and quarantine mode - spam 
with a score > n gets quarantined instead of passed through.  There are 
commercial products that will do this, along with Maia MailGuard and a few 
others, including a plugin for squirrelmail.


sa database transferable?

2004-12-20 Thread Andy Hester
I have just built a new spam filter with
postfix/amavisd/spamassassin to replace our old
sendmail/mimedefang/spamassassin spam filter, which was buckling under the
load. Can I copy the SA databases over to the new filter to help my
new filter learn? If not, any ideas on how I can train my new system from
the old one? I don't have a large number of spam messages on the
old machine and I'm concerned about going live with my new server.
The last time I checked, the old system had processed about 75K messages in one
day for a system with approx. 50 users. I don't want my users to get
bombarded (or my Exchange server to crash and burn) while my new filter
learns.

Any ideas or help welcomed.

Thanks,
Andy Hester
Network Engineer
Galactic, LTD


Re: OT Bouncing Spam

2004-12-20 Thread Mike
On Mon, 20 Dec 2004 14:49:59 -0600, ChupaCabra [EMAIL PROTECTED] wrote:
  Or does anyone think bouncing all spam is a good idea?
 
 Thanks ahead.
 
 --
 Michael H. Collins  Admiral, Penguinista Navy
 

Bouncing spam will do two things. First, it'll generate a lot of
useless traffic, which may or may not cost you money, but will
(slightly) increase the costs for everyone whose networks your bounces
transit.

The other thing it will do is queue up a lot of email in your outbound
MTA queue. A lot of spam is sent with completely bogus
usernames/domains/etc., or is sent from domains that refuse
connections to their MX records. You'll spend a fair amount of time
purging invalid email out of your mail queue, which tends to be boring
:)

We did bounce spam for a while, but have since just let the end users
decide what to do with it. Ultimately, this is the best solution, as
what may be good for one person, may not be an option for another.
It's not hard to create a rule to delete all email heading to your
boss that is marked spam.
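
A per-user rule like that could look roughly like the Python sketch below.
It assumes the filter tags the Subject with [SPAM], as mentioned earlier in
this thread; the boss's address and the function name are hypothetical, and
a real setup would more likely use procmail or the mail client's own
filters.

```python
from email import message_from_string

SPAM_TAG = "[SPAM]"        # subject tag mentioned in this thread
BOSS = "boss@example.com"  # hypothetical address


def should_discard(raw_message, recipient=BOSS, tag=SPAM_TAG):
    """Return True for mail to `recipient` whose Subject starts with the tag."""
    msg = message_from_string(raw_message)
    subject = msg.get("Subject", "").strip()
    to_header = msg.get("To", "").lower()
    return subject.startswith(tag) and recipient.lower() in to_header


sample = (
    "From: someone@example.net\n"
    "To: boss@example.com\n"
    "Subject: [SPAM] Cheap watches\n"
    "\n"
    "body\n"
)
print(should_discard(sample))  # True
```

Filing into a folder instead of deleting is the safer variant, since a
false positive can still be recovered.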

Mike


Re: sa database transferable?

2004-12-20 Thread Michele Neylon::Blacknight Solutions
Andy Hester wrote:
I have just built a new spam filter with postfix/amavisd/spamassassin  
to replace our old sendmail/mimedefang/spamassassin spam filter, which 
was buckling under the load.  Can I copy the SA databases over to the 
new filter to help my new filter learn?  If not, any ideas on how I can 
train my new system from the old one?  I don't have a large number of 
spam messages on the old machine and I'm concerned about going live with 
my new server.  The last time I checked, the old system had processed 
about 75K messages in one day for a system with approx. 50 users.  I 
don't want my users to get bombarded (or my Exchange server to crash and 
burn) while my new filter learns.

If both systems are running the same version of SA, then the Bayes 
database versions should be the same and copying across should not be a problem.

YMMV



Re: OT Bouncing Spam

2004-12-20 Thread ChupaCabra

Evan Platt wrote:
Evan Platt said:
 

I don't have a link for you, but tell your boss to imagine if someone
decided to dictionary attack every ISP they could find, using not only
dictionary words, but every combination of letters up to 9 letters, i.e.
a, b, c, etc. up to z, for every ISP they can find. And tell your boss that
they intend to use HIS address as the reply-to address for the spam. Now ask
him if he still thinks it's a good idea for ISPs to 'bounce' spam to this
unintended victim - him.
   

Let me follow up to myself (please allow myself to introduce... myself.) I
posted a message to a yahoo group last week. A few minutes later, I get an
e-mail that my message has been marked as spam by some software, and if I
wish to confirm my identity, I must click on a link to that company's web
site (tracking numbers and all that in the URL). And, of course, this will
add me to the person's allowed list so I won't have to do it again.

Needless to say, I will NOT do that. This company could then sell its
lists of CONFIRMED addresses for a goldmine.

I then posted to the list, asked if anyone else had received this message,
and a number of people did, and for the most part, no one clicked on the
link. So now there's some 1d10t wondering why he's not getting any mail. I
know this isn't your boss's intention, but it sounds like he wants
anything marked as spam deleted? Not a good idea, IMHO.
(Baby, bathwater).

Evan
 

First he wanted that.  I did it but actually kept 'em all.  So then his 
partner didn't get an urgent email, so it was turned back to the users to 
decide.  I get a different knee-jerk each week.  What fun dealing with an 
80 yo ex-military man.  This a.m. it was "Let's spambomb every ISP that 
sends spam and maybe *they* will do something about it.  And screw the 
rest of the world too.  America owns the internet.  Fsck 'em, they would 
all die without the American economy," etc.

--
Michael H. Collins  Admiral, Penguinista Navy




Re: OT Bouncing Spam

2004-12-20 Thread ChupaCabra

shane mullins wrote:
Could you just discard it?
 

I was, until a couple of VIPs lost important email.  I was actually 
keeping it all because I knew better.

 



RE: OT Bouncing Spam

2004-12-20 Thread Ring, John C
My boss is twisting off today because he got 350 messages marked [SPAM] 
over the weekend.  His reaction is "Bounce 'em all, let the ISPs sort 
it out."

And then when a spammer sends tons of e-mail to your site forged as, say,
[EMAIL PROTECTED], there is a good chance IBM will end up blocking all
email from your site, at least for a while.

Also, check the terms of service with your ISP.  I suppose it is possible
they might consider such a configuration as abusive, and could use it as
grounds to terminate your service.  Plenty of people are savvy enough to
report spam to the ISP of the sender.  (In this case, you would in effect be
that sender.)

What you want to do, IMO, is run SpamAssassin during the SMTP session, such
as with http://duncanthrax.net/exiscan-acl/.  Then reject messages with a
very high score, but simply label and deliver messages for a lower spam
score.  For example, reject mail scoring 15 or higher, mark and deliver if 5
or over, and classify as ham if the score is below 5.
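
The three bands can be written down as a tiny decision function. This is
just a Python sketch of the logic using the example numbers above (15 and
5); the function and constant names are made up, and the real decision
would live in your MTA's SMTP-time configuration.

```python
REJECT_AT = 15.0  # reject during the SMTP session at or above this score
TAG_AT = 5.0      # mark as spam but still deliver at or above this score


def smtp_time_action(score):
    """Map a spam score to one of three actions."""
    if score >= REJECT_AT:
        return "reject"
    if score >= TAG_AT:
        return "tag-and-deliver"
    return "deliver"


for score in (22.3, 7.1, 1.4):
    print(score, smtp_time_action(score))
```

Because the rejection happens while the sending host is still connected,
the sending server is the one responsible for any notification, and you
never generate a bounce to a forged address yourself.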

Other MTAs provide other ways of doing this as well.



-- 
John C. Ring, Jr. 
[EMAIL PROTECTED] 
Network Engineer
Union Switch & Signal Inc.

If all mankind minus one, were of one opinion, and only one person were of the
contrary opinion, mankind would be no more justified in silencing that one
person, than he, if he had the power, would be justified in silencing
mankind -- John Stuart Mill


Re: OT Bouncing Spam

2004-12-20 Thread Evan Platt
ChupaCabra said:
 First he wanted that.  I did it but actually kept 'em all.  So then his
 partner didn't get an urgent email, so it was turned back to the users to
 decide.  I get a different knee-jerk each week.  What fun dealing with an
 80 yo ex-military man.  This a.m. it was "Let's spambomb every ISP that
 sends spam and maybe *they* will do something about it.  And screw the
 rest of the world too.  America owns the internet.  Fsck 'em, they would
 all die without the American economy," etc.

Perhaps he doesn't understand the thinking behind "let's spambomb every
ISP that sends spam".

Back to my second Joe-Job example.
Let's say I'm connected with a dial-up account in China. I
spoof all headers to indicate my spam comes from [EMAIL PROTECTED] . Who
gets the bounce messages, ChinaSpamHaven.hk , or [EMAIL PROTECTED] / aol.com?


Re: Bayes question

2004-12-20 Thread Chuck Campbell
On Mon, Dec 20, 2004 at 12:56:43PM -0600, Steve Bondy wrote:
 Just because you learn something as spam doesn't mean it will be
 blocked.
 SA will add a score to the message based on the bayes rules, but if the
 bayes rules are the only ones that get hit, and they score less than
 your threshold, it won't keep the stuff out.
 For example, the default score in 2.6.x for BAYES_90 is either 2.454 or
 2.101.  If that's the only rule you hit, and your threshold is above
 those numbers, it will come through.
 

But what if you repeatedly learn the message(s) in question as spam?  
Shouldn't bayes start to give it higher scores?  If it becomes a near perfect 
match, it should get a bayes_99, right?

-chuck




RE: Bayes question

2004-12-20 Thread Steve Bondy
I'm no expert on Bayes, but as far as I know, repeatedly learning the
same message over and over again doesn't do you any good.  Once the
tokens are in there, that's it.  The bayes score goes up as more tokens
in the message match 
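
Here is a toy Python model of what I mean. It is a deliberately simplified
sketch, not SpamAssassin's Bayes code: it assumes a message that was
already learned is skipped the second time (tracked here by message ID),
and it combines token probabilities with a plain average, which is much
cruder than what a real filter does. All class and method names are made
up for the example.

```python
from collections import Counter


class ToyBayes:
    """Very simplified token-count model for illustration only."""

    def __init__(self):
        self.spam_tokens = Counter()
        self.ham_tokens = Counter()
        self.nspam = 0
        self.nham = 0
        self.seen = set()  # IDs of messages already learned

    def learn(self, msg_id, text, is_spam):
        """Learn a message once; return False if it was already learned."""
        if msg_id in self.seen:
            return False
        self.seen.add(msg_id)
        tokens = set(text.lower().split())
        if is_spam:
            self.spam_tokens.update(tokens)
            self.nspam += 1
        else:
            self.ham_tokens.update(tokens)
            self.nham += 1
        return True

    def token_prob(self, token):
        """Spam probability of one token; 0.5 if never seen."""
        s = self.spam_tokens[token] / max(self.nspam, 1)
        h = self.ham_tokens[token] / max(self.nham, 1)
        return 0.5 if s + h == 0 else s / (s + h)

    def spam_prob(self, text):
        """Average the token probabilities (crude stand-in for real combining)."""
        tokens = set(text.lower().split())
        if not tokens:
            return 0.5
        return sum(self.token_prob(t) for t in tokens) / len(tokens)


b = ToyBayes()
print(b.learn("id1", "cheap pills online", True))   # True: learned
print(b.learn("id1", "cheap pills online", True))   # False: already seen
b.learn("id2", "meeting notes attached", False)
print(b.spam_prob("cheap pills now") > b.spam_prob("cheap meeting now"))  # True
```

In this model the second learn of the same message changes nothing, while
a message sharing more tokens with learned spam scores higher.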

Someone please correct me if I'm wrong, and confirm if I'm right... It
would help me out too.

Steve

 -Original Message-
 From: Chuck Campbell [mailto:[EMAIL PROTECTED] 
 Sent: Monday, December 20, 2004 3:54 PM
 To: Steve Bondy
 Cc: SpamAssassin Users
 Subject: Re: Bayes question
 
 
 On Mon, Dec 20, 2004 at 12:56:43PM -0600, Steve Bondy wrote:
  Just because you learn something as spam doesn't mean it will be
  blocked. SA will add a score to the message based on the bayes rules,
  but if the bayes rules are the only ones that get hit, and they score
  less than your threshold, it won't keep the stuff out.
  For example, the default score in 2.6.x for BAYES_90 is either 2.454 or
  2.101.  If that's the only rule you hit, and your threshold is above
  those numbers, it will come through.
  
 
 But what if you repeatedly learn the message(s) in question as spam?  
 Shouldn't bayes start to give it higher scores?  If it becomes a near
 perfect match, it should get a bayes_99, right?
 
 -chuck
 
 
 


Re: Bayes question

2004-12-20 Thread Chuck Campbell
On Mon, Dec 20, 2004 at 04:13:44PM -0600, Steve Bondy wrote:
 I'm no expert on Bayes, but as far as I know, repeatedly learning the
 same message over and over again doesn't do you any good.  Once the
 tokens are in there, that's it.  The bayes score goes up as more tokens
 in the message match 

It's not the same message... exactly.  It is the same spam, coming from many
different senders, each with a unique message ID.  I keep getting more of them,
and I keep learning them with sa-learn.

I'm just not getting SA to notice them as spam.

-chuck



Re: Bayes question

2004-12-20 Thread Michael Parker
On Mon, Dec 20, 2004 at 04:18:58PM -0600, Chuck Campbell wrote:
 It's not the same message... exactly.  It is the same spam, coming from many
 different senders, each with a unique message ID.  I keep getting more of
 them, and I keep learning them with sa-learn.
 
 I'm just not getting SA to notice them as spam.
 

What rules are hitting?  Is BAYES_99 one of them?

Michael




RE: Bayes question

2004-12-20 Thread Steve Bondy

 
 On Mon, Dec 20, 2004 at 04:13:44PM -0600, Steve Bondy wrote:
  I'm no expert on Bayes, but as far as I know, repeatedly learning the
  same message over and over again doesn't do you any good.  Once the
  tokens are in there, that's it.  The bayes score goes up as more
  tokens in the message match
 
 It's not the same message... exactly.  It is the same spam, coming
 from many different senders, each with a unique message ID.  I keep
 getting more of them, and I keep learning them with sa-learn.
 
 I'm just not getting SA to notice them as spam.
 
 -chuck
 
 

So the message content is the same, but coming from different sources?


SPF tests fail on 3.02?

2004-12-20 Thread Henry Kwan
Hi.

Am trying to upgrade to 3.0.2 from 3.0.1 (RH FC1 with sendmail/spamd/procmail) but
on 'make test', I get the following errors.

t/spf...Not found: helo_pass =  SPF_HELO_PASS 
# Failed test 1 in t/SATest.pm at line 530
Not found: pass =  SPF_PASS 
# Failed test 2 in t/SATest.pm at line 530 fail #2
t/spf...FAILED tests 1-2 
Failed 2/2 tests, 0.00% okay

I checked and v3.0.1's spf test passed and I don't think I changed anything, so
what is 3.0.2 looking for that's new?

Thanks.

--Henry Kwan




Re: SPF tests fail on 3.02?

2004-12-20 Thread Theo Van Dinter
On Mon, Dec 20, 2004 at 10:46:16PM +, Henry Kwan wrote:
 t/spf...Not found: helo_pass =  SPF_HELO_PASS 
 I checked and v3.0.1's spf test passed and I don't think I changed anything, so
 what is 3.0.2 looking for that's new?

Known issue:

http://bugzilla.spamassassin.org/show_bug.cgi?id=4044

-- 
Randomly Generated Tagline:
Personally, Rorschach blots always look like butterflies to me.  Or
 pelvis bones, I admit it.
  -- Larry Wall, 8th State of the Onion




Re: SPF tests fail on 3.02?

2004-12-20 Thread Theo Van Dinter
On Mon, Dec 20, 2004 at 11:09:51PM +, Henry Kwan wrote:
 So if the test didn't actually fail, should I go ahead with the install or
 wait for v3.0.3?

As long as everything else passes, just install 3.0.2.

-- 
Randomly Generated Tagline:
I'm happy. I'm giddy. I'm spiffy. - Michael Kearney




Re: trying to install 3.0.2 via CPAN

2004-12-20 Thread Robert Menschel
Hello alan,

Saturday, December 18, 2004, 7:15:46 PM, you wrote:

ap for some reason i'm getting SPF failures during the 'make test'
ap phase: ...

I found a month or so ago, during a system rebuild, that for some
reason I was getting errors like this for 3.0.1, from a CPAN install,
but I then did a download of the tar and installed from that, and
make test came out clean.

You might try something similar -- use CPAN to make sure your
dependencies are all in place (especially the SPF prereqs), and then
install (at least through the make test) from a tarball, and see if
that gets around the problem.

Bob Menschel