Re: BAYES_99 makes lots of false-positive

2006-07-13 Thread Joshua, C.S. Chen




Matt Kettler wrote:

  In sa 2.6x or older, yes.. in sa 3.0.0 or higher, no.

First, phrases isn't quite accurate.. bayes stores tokens, and most of
the tokens are simply words, not phrases.

In SA 3.0.0 or higher the text tokens themselves are not stored, only
the SHA1 hash of them is stored. This cannot be easily reversed to
figure out what the text token was, but it's easy to figure out the hash
of another token and compare the two. Thus, it's impossible for dump to
display the text tokens, it doesn't know what they are.

The main reason to do this in SA 3.x is performance. All the SHA hashes
are the same size. No more variable-length string compares, just
straight fixed-width binary compares. Ditto for record reads. A side
effect is increased security.. nobody can look at your bayes DB and make
assumptions about what your email conversations talk about.

  



Thanks Matt, for the details.



  If you want to see the text tokens that match bayes for a particular
message, you can do this by feeding a message to spamassassin in bayes
debug mode..

spamassassin -D bayes=255 < message.txt

That should let you know which tokens in the message are matching bayes,
and what probability each gets (from 0 to 1, which represents
0% to 100%).

  some key phrases, words
in the spam mails? If so, can I see some chinese phrases?

I've never tried, but the above should work for Chinese text, provided
your local terminal supports it.

Word of advice: if you see a LOT of innocuous words matching in the
range of 0.90-1.0 you can worry. But do not worry about every single
word that seems "wrong". A typical message will match a dozen or more
tokens.

All that said, how do you fix it? Feed your problem messages to sa-learn
--ham. If it's really bad, wipe your bayes DB and start over.
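
To put the word of advice above into numbers, here is a minimal Python sketch that reports what share of a message's matched tokens fall in the 0.90-1.0 range. The (token, probability) pairs are made-up sample data and `high_probability_share` is a hypothetical helper, not part of SpamAssassin; the list would have to be filled in from whatever the bayes debug output shows.

```python
def high_probability_share(tokens, low=0.90, high=1.0):
    """Return the fraction of matched tokens whose probability lies in [low, high]."""
    if not tokens:
        return 0.0
    hits = sum(1 for _, prob in tokens if low <= prob <= high)
    return hits / len(tokens)


# Hypothetical (token, probability) pairs, not real SpamAssassin output.
sample = [("test", 0.12), ("tokens", 0.95), ("cheers", 0.40), ("details", 0.97)]
share = high_probability_share(sample)
print(f"{share:.0%} of matched tokens look spammy")
```

A handful of odd tokens is normal; a large share of everyday words in that range is the sign that the database has learned the wrong thing.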

  



It sounds great to be able to see which tokens match those in the bayes
db.
I tried a test message with -D bayes=255 like this:




$ spamassassin -D bayes=255 < /tmp/message
>From [EMAIL PROTECTED]  Fri Jul 14 10:32:01 2006
Return-Path: <[EMAIL PROTECTED]>
X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on
asiaa.sinica.edu.tw
X-Spam-Level:
X-Spam-Status: No, score=-102.2 required=6.0 tests=ALL_TRUSTED,AWL,
    FROM_IAA_LOCAL_SITE1,USER_IN_WHITELIST autolearn=no
version=3.1.0
Received: from [140.109.177.202] (genesis.asiaa.sinica.edu.tw
[140.109.177.202])
    by asiaa.sinica.edu.tw (8.13.1/8.13.1) with ESMTP id
k6E2VqVw011774
    for <[EMAIL PROTECTED]>; Fri, 14 Jul 2006
10:31:52 +0800
Message-ID: <[EMAIL PROTECTED]>
Date: Fri, 14 Jul 2006 10:31:52 +0800
From: "Joshua, C.S. Chen" <[EMAIL PROTECTED]>
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.13)
Gecko/20060418 Red Hat/1.7.13-1.4.1
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: =?Big5?B?rEyswA==?= <[EMAIL PROTECTED]>
Subject: test for spamassassin -D bayes=255
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Virus-Scanned: by amavisd-new
X-Keywords:
X-UID: 9719
Status: O
Content-Length: 88
Lines: 4

This is a test. How I want to see the tokens' details that bayes thinks.

Cheers
Joshua







It just showed the original message, not the tokens and probabilities.
Am I missing something here?


Thanks very much

Cheers
Joshua




Re: Network tests slowing down spamassassin

2006-07-13 Thread Daryl C. W. O'Shea

On 7/13/2006 11:06 AM, Ramprasad wrote:


  So what is the best way to reduce network traffic? We are already
getting the sbl-xbl lists from Spamhaus so as to serve those lists
locally. Can I get any other lists locally? Commercial agreements
are also OK.


Many/most lists will provide rsync access to sites where it is efficient 
to do so.  For most lists, they'll provide access to anyone who 
processes over 100,000 messages a day.


Visit the websites of the lists in question for details and contact 
info.  Some of the lists have rsync access application forms to fill out 
and send in available on their sites.



Daryl


Re: Network tests slowing down spamassassin

2006-07-13 Thread Ramprasad
On Thu, 2006-07-13 at 11:17 -0400, Craig Morrison wrote:
> Ramprasad wrote:
> > Hi,
> >   SA works fine for the quite large setup that we have (we get up to
> > 200k mails an hour at peak times).
> >   But I notice it is too network dependent. A little problem with the
> > network and all hell breaks loose. Mailq shoots up and SA starts timing
> > out.
> >   Probably because I have enabled all kinds of BL tests and URI checks.
> > But these checks are indispensable; without them SA would have no teeth
> > at all.
> >
> >   So what is the best way to reduce network traffic? We are already
> > getting the sbl-xbl lists from Spamhaus so as to serve those lists
> > locally. Can I get any other lists locally? Commercial agreements
> > are also OK.
> > 
> 
> Are you running a local caching nameserver?

Yes of course. Sorry not to have mentioned that.
We use djbdns dnscache on some servers and bind on the others.
But caching does not solve all problems 

Thanks
Ram





RE: The best way to use Spamassassin is to not use Spamassassin

2006-07-13 Thread John D. Hardin
On Thu, 13 Jul 2006, Michael Scheidell wrote:

> From: John D. Hardin [mailto:[EMAIL PROTECTED] 
> 
> > > doesn't work the skip it and move on. I get rid of 120,000 
> > > spams a day 
> > > using that trick.
> > 
> > Ooo. Set it to maila.microsoft.com... {evil grin}
> 
> Not a good idea, since maila.microsoft.com doesn't have your userlist
> in, normal internet delay WILL push some email to it, and if you do,
> normal email will bounce with 55x unknown user, or unable to relay.

...ewww! His leg came right off. *pop*.

Now what do I do with it?

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174    pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  Apparently the Bush/Rove idea of being a "fiscal conservative" is
  to spend money like there's no tomorrow, run up huge deficits, and
  pray the Rapture happens before the bills come due.
   -- atul666 in Y! SCOX forum
---
 11 days until The 37th anniversary of Apollo 11 landing on the Moon



Re: Image only spam

2006-07-13 Thread Steven Stern
Jack Gostl wrote:
> 
> - Original Message - From: "Steven Stern"
> <[EMAIL PROTECTED]>
> To: "Spamass" 
> Sent: Wednesday, July 12, 2006 4:31 PM
> Subject: Re: Image only spam
> 
> 
>> Jack Gostl wrote:
>>> Thanks for the response.
>>>
>>> Take it slow with me, spamassassin has been running so well for so
>>> long that I haven't had to fiddle with it in ages and I don't
>>> remember the details. Do I add these rules to my user_prefs? Or to my
>>> /etc/mail/local.cf files?
>>>
>>> - Original Message - From: "Steven Stern"
>>> <[EMAIL PROTECTED]>
>>> To: "Spamass" 
>>> Sent: Wednesday, July 12, 2006 9:13 AM
>>> Subject: Re: Image only spam
>>>
>>>
>>>> Jack Gostl wrote:
>>>>> I'm running SpamAssassin version 3.0.3   running on Perl version 5.8.2
>>>>> under AIX 5.3. Starting a few months ago, I have been absolutely
>>>>> inundated with "image only spam".  I've gone from catching 99% of the
>>>>> spam with almost no false positives to less than 85%. I asked about
>>>>> this
>>>>> awhile ago, and tried to upgrade to SpamAssassin version 3.1.1 running
>>>>> on Perl version 5.8.0, and didn't see much improvement, so I left the
>>>>> prod machine alone.
>>>>>
>>>>> I'm sure I'm not the only one with this problem. Has anyone had any
>>>>> success with it?
>>>>>
>>>>> Thanks...
>>>>>
>>>>> Jack
>>>>>
>>>>
>>>> Are you using the SARE_STOCK rules from RulesDuJour at
>>>> rulesemporium.com?  We catch more than 99% of the image only stuff with
>>>> the standard RBLs and 70_sare_stock.cf.
>>>>
>>>> In case  you ask, these are the SARE rules we're using:
>>>>
>>>> TRUSTED_RULESETS="SARE_GENLSUBJ0 SARE_OBFU SARE_REDIRECT_POST300
>>>> SARE_ADULT SARE_HEADER0 SARE_CODING SARE_SPECIFIC SARE_SPOOF SARE_FRAUD
>>>> SARE_WHITELIST_SPF SARE_WHITELIST_RCVD SARE_URI0 SARE_OEM SARE_STOCKS";
>>>>
>>>> --
>>>>
>>>>  Steve

>> Hop over to the Rules Emporium (http://rulesemporium.com) and read
>> about RulesDuJour.  Install that and set up cron job to look for
>> updates once a day.  That's about it.  It's about 30 minutes of think
>> work up front to understand the documentation and install it. After
>> that, set it and forget it.
>>
>> http://www.exit0.us/index.php?pagename=RulesDuJour
>>
>> I think you'll be happy with the trusted ruleset line above.
> 
> wanted to tell you how this all turned out.
> 
> I installed the new rules, incorrectly as Dimitri observed, and then
> restarted spamassassin. (spamd actually). The spam capture rate has
> zoomed from 85% into the high 90s. Looking back I see that we replaced
> our processor about a year ago, and have been exceptionally stable since
> then. We haven't IPLed in almost a year, which also means that
> spamassassin probably hasn't been started in almost as long.
> 
> Obviously the new rules weren't the reason for the improvement, since
> they were installed wrong. So it must have been the restart. This makes
> me wonder, was it a "corruption", or is there a cumulative effect. I
> wonder if anyone has any thoughts on that.
> 
> 

I have a cron job scheduled for every Sunday

  sa-update && spamassassin --lint && /etc/init.d/spamassassin restart

This will pick up updates to the basic SA rules if they update them.

-- 

  Steve


RE: The best way to use Spamassassin is to not use Spamassassin

2006-07-13 Thread Michael Scheidell

> -Original Message-
> From: John D. Hardin [mailto:[EMAIL PROTECTED] 
> Sent: Thursday, July 13, 2006 9:52 AM
> To: Marc Perkel
> Cc: Bart Schaefer; users@spamassassin.apache.org
> Subject: Re: The best way to use Spamassassin is to not use 
> Spamassassin
> 
> 
> > doesn't work the skip it and move on. I get rid of 120,000 
> spams a day 
> > using that trick.
> 
> Ooo. Set it to maila.microsoft.com... {evil grin}

Not a good idea, since maila.microsoft.com doesn't have your userlist
in, normal internet delay WILL push some email to it, and if you do,
normal email will bounce with 55x unknown user, or unable to relay.

On a second note, I have secondary server, doing a lot of great header
checks, RFC checks, and SPF checks, and with that, I can prove that it
does a better job on blocking spam than SA, with 90% of the resources.

(however, it also blocks an unacceptable amount of REAL email!)

SA does a reasonable job of the balance between blocking spam and
allowing real email through.


Re: Spamassassin for web input forms ?

2006-07-13 Thread Kelson

Loren Wilton wrote:
If this web form isn't high volume, you could format the form input as a 
mail message and pipe it to spamassassin, then check the result.


Also, if the web form is written in Perl, you could access the 
SpamAssassin Perl modules directly.


--
Kelson Vibber
SpeedGate Communications 


Spamassassin for web input forms ?

2006-07-13 Thread Milan 'Koudis' Koudelka

Hi,
I have an idea to use SpamAssassin for checking web input forms, but I 
want a spam database that is separate from the mail spam database on the 
same machine. I think I can send the web form to some e-mail address; if 
SpamAssassin says it's OK, send it on to another e-mail address, where a 
maildrop script puts it into the database. Is this possible, or am I 
thinking about this the wrong way? Do you know of any articles or 
examples about this?


Thx.


Re: Problems on rethad 9.0

2006-07-13 Thread Kelson

Benny Pedersen wrote:

in the same way, it was a mistake for Red Hat to say it won't support older
Red Hat releases, since we now need to download an ISO file again to do an
upgrade that could have been done from up2date ?


You might find the following useful:

HOWTO: yum upgrade to CentOS 4.0
http://www.centos.org/modules/newbb/viewtopic.php?topic_id=654&forum=27

CentOS 4 is a clone of Red Hat Enterprise 4, and as such is as current 
and will remain supported as long as RHEL 4.  (Basically they grab the 
source RPMs, which are available per GPL and similar licenses, strip out 
anything they aren't allowed to redistribute, and rename it.)


You can grab a copy of yum and the appropriate config, then upgrade the 
system live.


I've upgraded two systems this way from Red Hat 9 to CentOS 3 (which is 
based on RHEL 3).  Version 3 is a bit older, but it's still supported 
and will continue to be supported for several years, unlike Red Hat 9, 
which lost official support two years ago and will likely lose the 
unofficial support from Fedora Legacy within the next 6 months to a year.


--
Kelson Vibber
SpeedGate Communications 


SPAMASSASSIN PER-USER AND QMAIL TOASTER

2006-07-13 Thread Marcos




Is it possible to configure per-user preferences using 
MySQL and SpamAssassin with the Qmail Toaster?
I am trying to do this, but I receive the error message 
below:
spamd: service unavailable: Error fetching user 
preferences via SQL at /usr/bin/spamd line 1684, line 2. 
Could somebody help me?
Thanks
Marcos
[EMAIL PROTECTED]
 


Re: Spamassassin for web input forms ?

2006-07-13 Thread Loren Wilton

I have an idea to use SpamAssassin for checking web input forms, but I


Ok.


want a spam database that is separate from the mail spam database on the same machine.


Ok.

I think I can send the web form to some e-mail address; if SpamAssassin 
says it's OK, send it on to another e-mail address, where a maildrop 
script puts it into the database. Is this possible, or am I thinking about 
this the wrong way? Do you know of any articles or examples about this?


I think this is probably too complicated.

I am assuming you have some script that is attached to the web form that 
sends the mail.  This script must already be making a mail header with From, 
To, and Subject, at the least.  If it isn't doing this, it probably has the 
information that it can.


If this web form isn't high volume, you could format the form input as a 
mail message and pipe it to spamassassin, then check the result.  I believe 
that you can do the same thing using spamc/spamd, but there is a recent 
problem with the return value being incorrect in some cases.  If the result 
indicates that it is spam you could either complain to the form sender, or 
quietly throw the message away.  Probably best to complain to the sender in 
case it really wasn't supposed to be spam; a spambot will probably ignore 
the complaint anyway.
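
The "format the form input as a mail message and pipe it to spamassassin" idea can be sketched in Python roughly as follows. This is only an illustration under assumptions: the function names, the placeholder recipient `webform@localhost` and the field names are made up, and the check relies on the `spamassassin -e` option, which makes the command exit with a non-zero code when the message is judged to be spam. A non-zero code can also mean the command itself failed, so a real script would want to tell those cases apart.

```python
import subprocess
from email.message import EmailMessage


def form_to_message(sender, subject, body, recipient="webform@localhost"):
    """Wrap web form fields in a minimal mail message (From, To, Subject, body)."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    # Collapse whitespace so a multi-line form field cannot inject extra headers.
    msg["Subject"] = " ".join(subject.split())
    msg.set_content(body)
    return msg


def looks_like_spam(msg, command=("spamassassin", "-e")):
    """Pipe the message to SpamAssassin; treat a non-zero exit code as spam."""
    result = subprocess.run(
        list(command),
        input=msg.as_bytes(),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode != 0


# Build a message from hypothetical form fields (SpamAssassin is not invoked here).
example = form_to_message("visitor@example.com", "Hello\nthere", "Nice site.")
```

Starting one spamassassin process per submission is slow, which is why this only suits a low-volume form.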


Now, using a different database.  Depends on what you mean by 'database'. 
Do you mean a different set of rules?  Do you mean a different Bayes 
database?  Do you mean something else?


If you mean different rules, there are options to spamassassin that will 
tell it to look in other than the normal places for the rules files.  You 
could use this when you invoke SA.  I don't know if there is something 
similar for Bayes, but it wouldn't surprise me.


   Loren



dnsbl debugging

2006-07-13 Thread Ben Wylie

I am running SpamAssassin 3.1.2 on Windows 2003.

I have had a problem with dnsbl lookups: even though they complete, 
they do not return any results.


It appears that only when I get a DNS timeout are any lookups that did 
complete and were positive actually returned and scored against the email.
I wrote a longer email with log excerpts etc., but this list kept 
rejecting it (Is there any way to get around that? It is so frustrating, 
especially as it doesn't tell you what is causing the high score).

It is here:
http://www.arkbb.co.uk/logs/2006-07-13-SpamAssassinQ.txt

Is there a way to turn on extra DNSBL logging, as it doesn't tell you 
very much?


Thanks
Ben
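
One way to narrow down a problem like this is to repeat a DNSBL lookup by hand, outside SpamAssassin. The Python sketch below shows the usual query convention (reverse the IPv4 octets and append the list's zone). The zone `dnsbl.example.org` is a placeholder and both function names are hypothetical; this is not how SpamAssassin itself performs or logs its lookups.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def dnsbl_lookup(ip, zone):
    """Return the A record the list answers with, or None if nothing comes back.

    None covers both "not listed" and "lookup failed", so it cannot
    distinguish a clean address from a resolver problem.
    """
    try:
        return socket.gethostbyname(dnsbl_query_name(ip, zone))
    except OSError:
        return None


# 127.0.0.2 is the conventional test address that many lists answer for.
name = dnsbl_query_name("127.0.0.2", "dnsbl.example.org")
print(name)
```

If a manual lookup against a real list answers quickly while SpamAssassin reports nothing, the problem is more likely in how the results are collected than in DNS itself.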



Re: Abuse of SARE whitelist

2006-07-13 Thread Theo Van Dinter
On Thu, Jul 13, 2006 at 11:21:09AM +0100, Justin Mason wrote:
> ok -- there's the bug ;)  SpamAssassin is misinterpreting your
> MX's Received headers.
[...]
> Could you open a bug on the SpamAssassin bugzilla about that?  Attach
> the debug output and sample again (it can be tricky to find posts to
> the mailing list when fixing the bug later otherwise).  cheers,

For anyone else following this, the bug was already fixed in 3.1.2 via
bug 4813.

-- 
Randomly Generated Tagline:
"Is blue supposed to be soothing when I lose my data?"  - Dave DeMaagd




Re: Network tests slowing down spamassassin

2006-07-13 Thread Craig Morrison

Ramprasad wrote:

Hi,
  SA works fine for the quite large setup that we have (we get up to
200k mails an hour at peak times).
  But I notice it is too network dependent. A little problem with the
network and all hell breaks loose. Mailq shoots up and SA starts timing
out.
  Probably because I have enabled all kinds of BL tests and URI checks.
But these checks are indispensable; without them SA would have no teeth
at all.

  So what is the best way to reduce network traffic? We are already
getting the sbl-xbl lists from Spamhaus so as to serve those lists
locally. Can I get any other lists locally? Commercial agreements
are also OK.



Are you running a local caching nameserver?

For my group that seems to help a great deal.

--
Craig


Re: Network tests slowing down spamassassin

2006-07-13 Thread Rick Macdougall

Ramprasad wrote:

Hi,
  SA works fine for the quite large setup that we have (we get up to
200k mails an hour at peak times).
  But I notice it is too network dependent. A little problem with the
network and all hell breaks loose. Mailq shoots up and SA starts timing
out.
  Probably because I have enabled all kinds of BL tests and URI checks.
But these checks are indispensable; without them SA would have no teeth
at all.

  So what is the best way to reduce network traffic? We are already
getting the sbl-xbl lists from Spamhaus so as to serve those lists
locally. Can I get any other lists locally? Commercial agreements
are also OK.



We have a similar setup and we don't have any problems.  We do run a 
local DNS cache server on each SA server though (read, not Bind).  Are 
you running a cache server on each SA server ?


We get about 1 million cached request hits an hour and 32K external 
requests in the same time frame.
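
As a rough check on those figures, and assuming the cached hits and the external requests are separate counts that together make up all lookups, the cache hit ratio works out like this:

```python
cached_hits = 1_000_000      # answered from the local cache per hour (figure from the message)
external_requests = 32_000   # sent on to outside servers per hour (figure from the message)

total = cached_hits + external_requests
hit_ratio = cached_hits / total
print(f"cache hit ratio: {hit_ratio:.1%}")
```

In other words, only about 3 in 100 lookups leave the machine, which is where the saving in network traffic comes from.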


Regards,

Rick



Network tests slowing down spamassassin

2006-07-13 Thread Ramprasad
Hi,
  SA works fine for the quite large setup that we have (we get up to
200k mails an hour at peak times).
  But I notice it is too network dependent. A little problem with the
network and all hell breaks loose. Mailq shoots up and SA starts timing
out.
  Probably because I have enabled all kinds of BL tests and URI checks.
But these checks are indispensable; without them SA would have no teeth
at all.

  So what is the best way to reduce network traffic? We are already
getting the sbl-xbl lists from Spamhaus so as to serve those lists
locally. Can I get any other lists locally? Commercial agreements
are also OK.


Thanks
Ram




   



Re: Image only spam

2006-07-13 Thread Dimitri Yioulos
On Thursday July 13 2006 9:28 am, Jack Gostl wrote:
> - Original Message -
> From: "Steven Stern" <[EMAIL PROTECTED]>
> To: "Spamass" 
> Sent: Wednesday, July 12, 2006 4:31 PM
> Subject: Re: Image only spam
>
> > Jack Gostl wrote:
> >> Thanks for the response.
> >>
> >> Take it slow with me, spamassassin has been running so well for
> >> so long that I haven't had to fiddle with it in ages and I don't
> >> remember the details. Do I add these rules to my user_prefs? Or
> >> to my /etc/mail/local.cf files?
> >>
> >> - Original Message - From: "Steven Stern"
> >> <[EMAIL PROTECTED]>
> >> To: "Spamass" 
> >> Sent: Wednesday, July 12, 2006 9:13 AM
> >> Subject: Re: Image only spam
> >>
> >>> Jack Gostl wrote:
> >>>> I'm running SpamAssassin version 3.0.3   running on Perl
> >>>> version 5.8.2 under AIX 5.3. Starting a few months ago, I have
> >>>> been absolutely inundated with "image only spam".  I've gone
> >>>> from catching 99% of the spam with almost no false positives
> >>>> to less than 85%. I asked about this
> >>>> awhile ago, and tried to upgrade to SpamAssassin version 3.1.1
> >>>> running
> >>>> on Perl version 5.8.0, and didn't see much improvement, so I
> >>>> left the prod machine alone.
> >>>>
> >>>> I'm sure I'm not the only one with this problem. Has anyone
> >>>> had any success with it?
> >>>>
> >>>> Thanks...
> >>>>
> >>>> Jack
> >>>
> >>> Are you using the SARE_STOCK rules from RulesDuJour at
> >>> rulesemporium.com?  We catch more than 99% of the image only
> >>> stuff with the standard RBLs and 70_sare_stock.cf.
> >>>
> >>> In case  you ask, these are the SARE rules we're using:
> >>>
> >>> TRUSTED_RULESETS="SARE_GENLSUBJ0 SARE_OBFU
> >>> SARE_REDIRECT_POST300 SARE_ADULT SARE_HEADER0 SARE_CODING
> >>> SARE_SPECIFIC SARE_SPOOF SARE_FRAUD SARE_WHITELIST_SPF
> >>> SARE_WHITELIST_RCVD SARE_URI0 SARE_OEM SARE_STOCKS";
> >>>
> >>> --
> >>>
> >>>  Steve
> >
> > Hop over to the Rules Emporium (http://rulesemporium.com) and
> > read about RulesDuJour.  Install that and set up cron job to look
> > for updates once a day.  That's about it.  It's about 30 minutes
> > of think work up front to understand the documentation and
> > install it. After that, set it and forget it.
> >
> > http://www.exit0.us/index.php?pagename=RulesDuJour
> >
> > I think you'll be happy with the trusted ruleset line above.
>
>  wanted to tell you how this all turned out.
>
> I installed the new rules, incorrectly as Dimitri observed, and
> then restarted spamassassin. (spamd actually). The spam capture
> rate has zoomed from 85% into the high 90s. Looking back I see that
> we replaced our processor about a year ago, and have been
> exceptionally stable since then. We haven't IPLed in almost a year,
> which also means that spamassassin probably hasn't been started in
> almost as long.
>
> Obviously the new rules weren't the reason for the improvement,
> since they were installed wrong. So it must have been the restart.
> This makes me wonder, was it a "corruption", or is there a
> cumulative effect. I wonder if anyone has any thoughts on that.

It appears that you were using only the SA default rules.  Now, these 
are pretty good, but I think most would agree that you want to 
supplement these with SARE rulesets, and perhaps bayes, DCC, razor, 
and pyzor (or some combination thereof).  Then, you've got a pretty 
tight system.

Dimitri




Re: SA RPMs for Suse 10.0?

2006-07-13 Thread Mick Pollard

Just d/l the source tarball and build RPMs from it:

rpmbuild -tb Mail-SpamAssassin-3.1.3.tar.gz

That should produce a binary RPM for you. I have not tested it yet.


Regards
Mick Pollard
lunix-aus

Paul Hutchings wrote:

I have a Suse 10.0 system running Spamassassin from the Suse RPMs that
used to be on Carsten Hoegers Suse FTP folder.

They don't appear to be there any more, so I wondered if anyone had any
pointers to compatible RPMs to get me up to 3.1.3?

I know I could uninstall the existing Spamassassin RPMs and grab it from
CPAN, I don't know if that might cause more problems than a simple RPM
upgrade.

cheers,
Paul
--
Paul Hutchings
Network Administrator, MIRA Ltd.
Tel: 44 (0)24 7635 5378, Fax: 44 (0)24 7635 8378
mailto:[EMAIL PROTECTED]
  


Re: $from and $to

2006-07-13 Thread Matt Kettler
Pezhman Lali wrote:
> Hi
> which variables in spamd, contained "from" and "to"  of the processed
> mail?
> if nothing,
> how can I add this variables?

Which "from" and "to" are you referring to? The ones in the message
body, or the actual ones from the envelope (ie: RCPT TO: and MAIL FROM
commands)?

If the envelope, SA doesn't have them. It might be able to guess from
the headers, but it doesn't know for sure.

Next, I'm assuming that since you're asking about variables, you're
modifying the code. Take a look around for the "all_from_addrs" and
"all_to_addrs" subroutines in the evaltests section. That should get you
a list of all the from/to addrs SA was able to deduce from the headers.

However, DO NOT under ANY condition, attempt to perform message delivery
based on the above information. You will not get the desired results for
any message that has been remailed (ie: a mailing list).
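
To illustrate what "addresses deduced from the headers" means, here is a rough Python analogue using only the standard library. It is not SpamAssassin's Perl code and `header_addresses` is a hypothetical name; it simply collects every address found in the named header fields of a made-up message.

```python
from email import message_from_string
from email.utils import getaddresses


def header_addresses(raw_message, header_names):
    """Collect all addresses that appear in the given header fields."""
    msg = message_from_string(raw_message)
    values = []
    for name in header_names:
        values.extend(msg.get_all(name, []))
    return [addr for _, addr in getaddresses(values) if addr]


# Hypothetical message used only for illustration.
raw = (
    "From: Alice <alice@example.com>\n"
    "To: list@example.org, Bob <bob@example.net>\n"
    "Cc: carol@example.com\n"
    "Subject: hi\n"
    "\n"
    "body\n"
)
from_addrs = header_addresses(raw, ["From"])
to_addrs = header_addresses(raw, ["To", "Cc"])
print(from_addrs, to_addrs)
```

Note that the list address shows up as a recipient while the subscriber who actually received the copy does not, which is exactly why header-derived addresses are no substitute for the envelope.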



Re: BAYES_99 makes lots of false-positive

2006-07-13 Thread Matt Kettler
Joshua, C.S. Chen wrote:
> Hello folks,
> My users speak Chinese. I found that spamassassin seems not working well
> about chinese chset (utf8 or big5) on the bayes issue. Many normal mails
> (almost) get BAYES_99 score although the real spam also get BAYES_99. It
> looks like foreign language like Chinese is very easy to be high bayes
> scored.
> I have setup ok_locales all but it doesn't help the false-positive problem.
>
> And another question: just wonder what if I do sa-learn --dump? Am I
> supposed to see the phrase that SA has learned? 
In sa 2.6x or older, yes.. in sa 3.0.0 or higher, no.

First, phrases isn't quite accurate.. bayes stores tokens, and most of
the tokens are simply words, not phrases.

In SA 3.0.0 or higher the text tokens themselves are not stored, only
the SHA1 hash of them is stored. This cannot be easily reversed to
figure out what the text token was, but it's easy to figure out the hash
of another token and compare the two. Thus, it's impossible for dump to
display the text tokens, it doesn't know what they are.

The main reason to do this in SA 3.x is performance. All the SHA hashes
are the same size. No more variable-length string compares, just
straight fixed-width binary compares. Ditto for record reads. A side
effect is increased security.. nobody can look at your bayes DB and make
assumptions about what your email conversations talk about.
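
The idea of storing only fixed-width hashes can be sketched in Python like this. It is an illustration only: it keeps the full 20-byte SHA-1 digest in an in-memory set, whereas SpamAssassin's real on-disk key format is not shown in this thread and may differ (for example, by truncating the digest). The names `token_key`, `seen` and `is_known` are hypothetical.

```python
import hashlib


def token_key(token):
    """Fixed-width key for a token: the raw 20-byte SHA-1 digest of its UTF-8 bytes."""
    return hashlib.sha1(token.encode("utf-8")).digest()


# The "database" holds only hashes; the original words cannot be listed back.
seen = {token_key(word) for word in ["viagra", "meeting", "報告"]}


def is_known(token):
    """Hash the candidate token and compare it with the stored keys."""
    return token_key(token) in seen


print(len(token_key("meeting")), is_known("meeting"), is_known("lunch"))
```

Every key has the same length whatever the token's language or size, and checking a candidate token only needs its hash, which matches the "easy to compare, hard to reverse" point above.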

If you want to see the text tokens that match bayes for a particular
message, you can do this by feeding a message to spamassassin in bayes
debug mode..

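spamassassin -D bayes=255 < message.txt

That should let you know which tokens in the message are matching bayes,
and what probability each gets (from 0 to 1, which represents
0% to 100%).

Word of advice: if you see a LOT of innocuous words matching in the
range of 0.90-1.0 you can worry. But do not worry about every single
word that seems "wrong". A typical message will match a dozen or more
tokens.

All that said, how do you fix it? Feed your problem messages to sa-learn
--ham. If it's really bad, wipe your bayes DB and start over.

> [...] some key phrases, words
> in the spam mails? If so, can I see some chinese phrases?
>   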
I've never tried, but the above should work for Chinese text, provided
your local terminal supports it.



Re: The best way to use Spamassassin is to not use Spamassassin

2006-07-13 Thread John D. Hardin
On Wed, 12 Jul 2006, Marc Perkel wrote:

> Depends on what he's doing it might work. I catch most spam based on 
> sender behavior rather than message content. For example, anyone can do 
> this trick. Set your highest MX record (add a new one) to an IP address 
> that doesn't exist. Some spammers spam the highest MX first and it that 
> doesn't work the skip it and move on. I get rid of 120,000 spams a day 
> using that trick.

Ooo. Set it to maila.microsoft.com... {evil grin}

--
 John Hardin KA7OHZ    ICQ#15735746    http://www.impsec.org/~jhardin/
 [EMAIL PROTECTED]    FALaholic #11174    pgpk -a [EMAIL PROTECTED]
 key: 0xB8732E79 - 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
 11 days until The 37th anniversary of Apollo 11 landing on the Moon



Re: Image only spam

2006-07-13 Thread Jack Gostl


- Original Message - 
From: "Steven Stern" <[EMAIL PROTECTED]>

To: "Spamass" 
Sent: Wednesday, July 12, 2006 4:31 PM
Subject: Re: Image only spam



Jack Gostl wrote:

Thanks for the response.

Take it slow with me, spamassassin has been running so well for so long 
that I haven't had to fiddle with it in ages and I don't remember the 
details. Do I add these rules to my user_prefs? Or to my 
/etc/mail/local.cf files?


- Original Message - From: "Steven Stern" 
<[EMAIL PROTECTED]>

To: "Spamass" 
Sent: Wednesday, July 12, 2006 9:13 AM
Subject: Re: Image only spam



Jack Gostl wrote:

I'm running SpamAssassin version 3.0.3   running on Perl version 5.8.2
under AIX 5.3. Starting a few months ago, I have been absolutely
inundated with "image only spam".  I've gone from catching 99% of the
spam with almost no false positives to less than 85%. I asked about this
awhile ago, and tried to upgrade to SpamAssassin version 3.1.1 running
on Perl version 5.8.0, and didn't see much improvement, so I left the
prod machine alone.

I'm sure I'm not the only one with this problem. Has anyone had any
success with it?

Thanks...

Jack



Are you using the SARE_STOCK rules from RulesDuJour at
rulesemporium.com?  We catch more than 99% of the image only stuff with
the standard RBLs and 70_sare_stock.cf.

In case  you ask, these are the SARE rules we're using:

TRUSTED_RULESETS="SARE_GENLSUBJ0 SARE_OBFU SARE_REDIRECT_POST300
SARE_ADULT SARE_HEADER0 SARE_CODING SARE_SPECIFIC SARE_SPOOF SARE_FRAUD
SARE_WHITELIST_SPF SARE_WHITELIST_RCVD SARE_URI0 SARE_OEM SARE_STOCKS";

--

 Steve

Hop over to the Rules Emporium (http://rulesemporium.com) and read about 
RulesDuJour.  Install that and set up cron job to look for updates once a 
day.  That's about it.  It's about 30 minutes of think work up front to 
understand the documentation and install it. After that, set it and forget 
it.


http://www.exit0.us/index.php?pagename=RulesDuJour

I think you'll be happy with the trusted ruleset line above.


wanted to tell you how this all turned out.

I installed the new rules, incorrectly as Dimitri observed, and then 
restarted spamassassin. (spamd actually). The spam capture rate has zoomed 
from 85% into the high 90s. Looking back I see that we replaced our 
processor about a year ago, and have been exceptionally stable since then. 
We haven't IPLed in almost a year, which also means that spamassassin 
probably hasn't been started in almost as long.


Obviously the new rules weren't the reason for the improvement, since they 
were installed wrong. So it must have been the restart. This makes me 
wonder, was it a "corruption", or is there a cumulative effect. I wonder if 
anyone has any thoughts on that.






Re: Problems on rethad 9.0

2006-07-13 Thread Benny Pedersen
On Wed, July 12, 2006 11:53, Tom Brown wrote:

>> Im working in rethad 9.0 and try to install spamassassin in a good way
> why redhat 9?

why redhat at all ? :-)

>> How can i fix this problem?
> use a more upto date OS ??

70 million people still use Windows 98. Now that Microsoft is ending
security updates for it, this might get 70 million people to join the fun of
Windows XP ?

if Windows 98 still does the job for 70 million people, why change then ?

in the same way, it was a mistake for Red Hat to say it won't support older
Red Hat releases, since we now need to download an ISO file again to do an
upgrade that could have been done from up2date ?

think about it :-)))




-- 
Benny



$from and $to

2006-07-13 Thread Pezhman Lali
Hi,
Which variables in spamd contain the "from" and "to" of the processed mail?
If none, how can I add these variables?

Best
Pezhman

modifying log

2006-07-13 Thread Pezhman Lali
Hi,
Where is the logging module in spamd? I want to add $from and $to to the log.
Which file, and which part, must be changed?

Best
Pezhman Lali

Re: mangled uris

2006-07-13 Thread JamesDR

Ramprasad wrote:

Spamassassin works pretty great for me, but some spammers keep
upgrading. Some of my clients are still getting stupid spams thru

I think this was discussed before how do I catch spam with mangled urls.
Sorry if this is a repeat 

Something like 


--
visit 
http://somespammmersite. com  ... delet the space befre the com

-

I don't know if the spammer will ever get any customer to really "delet"
the space and go to the URL he intends.


I don't understand the business sense behind this spam. It's a lose-lose
game. The spammer never gets anyone to click (who would click a broken URL,
fix it, and click again?), the site owner never gets hits, the spam filter
guy gets more headaches, and the end user has to delete one more mail.



Thanks
Ram




I think it has more to do with them knowing their current efforts are in 
vain, so now it has come down to some rather odd tricks. I've seen a few 
that give a web address and instruct the 'reader' to add http://www to the 
beginning and .com to the end. This seems fruitless to me, but it 
must be working on some group of people, because I still see a few mails 
a day using this technique. It comes down to what users will and won't do. 
It seems some will do what the spammer wants :-D
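
For anyone who wants to experiment with catching the "delete the space" kind of mangling, here is a rough Python sketch. It is not a SpamAssassin rule. The function name find_mangled_uris and the short TLD list are assumptions made up for illustration, and a pattern this loose will produce some false positives on ordinary prose.

```python
import re

# Hypothetical, deliberately short TLD list for illustration only.
TLDS = r"(?:com|net|org|info|biz)"

MANGLED_URI = re.compile(
    r"https?://[a-z0-9-]+(?:\.[a-z0-9-]+)*"   # scheme and the intact host labels
    r"(?:\s+\.\s*|\.\s+)"                      # whitespace next to the last dot
    + TLDS + r"\b",
    re.IGNORECASE,
)

def find_mangled_uris(text):
    """Return the repaired form of every URL whose TLD was split off by whitespace."""
    return [re.sub(r"\s+", "", m.group(0)) for m in MANGLED_URI.finditer(text)]
```

Whether a hit should add to the score, or whether the repaired URL should be checked against a URI blacklist, is a separate decision that this sketch does not make.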


--
Thanks,
JamesDR




Re: Abuse of SARE whitelist

2006-07-13 Thread Justin Mason

ok -- there's the bug ;)  SpamAssassin is misinterpreting your
MX's Received headers.

Received: from vm.vonage.com ([220.166.39.177]) by amsfep14-int.chello.nl
  (InterMail vM.6.01.04.04 201-2131-118-104-20050224) with SMTP
  id <[EMAIL PROTECTED]>
  for <[EMAIL PROTECTED]>; Thu, 13 Jul 2006 08:33:46 +0200

This actually means "received from IP 220.166.39.177, which had no
rDNS, which HELO'd as vm.vonage.com".  SpamAssassin parses it as:

debug: received-header: parsed as [ ip=220.166.39.177 rdns=vm.vonage.com 
  helo=vm.vonage.com by=amsfep14-int.chello.nl ident= envfrom= intl=0 
  [EMAIL PROTECTED] auth= ]

in other words "received from IP 220.166.39.177, which had both rDNS and
HELO of vm.vonage.com".  Note that it's used the HELO string as rDNS,
incorrectly.  This is the bug.
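
To make the distinction concrete, here is a minimal Python sketch of a parser that keeps rDNS and HELO apart. It is not SpamAssassin's parser (which is written in Perl). The regular expression and the name parse_received are assumptions for illustration, and the sketch assumes that an MTA which did find rDNS would write it as "from HELO (rdns [ip]) by ...", which has not been verified for InterMail.

```python
import re

# Hypothetical pattern covering two shapes of the "from" clause:
#   from HELO ([ip]) by HOST          -> no rDNS was found
#   from HELO (rdns [ip]) by HOST     -> rDNS precedes the bracketed IP
RECEIVED_RE = re.compile(
    r"from\s+(?P<helo>\S+)\s+"
    r"\((?:(?P<rdns>[^\s\[\]()]+)\s+)?\[(?P<ip>[0-9.]+)\]\)\s+"
    r"by\s+(?P<by>\S+)"
)

def parse_received(header):
    """Return ip, rdns, helo and by for one Received header, or None if it does not match."""
    m = RECEIVED_RE.search(header)
    if m is None:
        return None
    return {
        "ip": m.group("ip"),
        # Leave rdns empty when the header carries none; never copy the HELO string.
        "rdns": m.group("rdns") or "",
        "helo": m.group("helo"),
        "by": m.group("by"),
    }
```

With the header above, this sketch yields an empty rdns and helo=vm.vonage.com, which is the reading described as correct.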

Could you open a bug on the SpamAssassin bugzilla about that?  Attach
the debug output and sample again (it can be tricky to find posts to
the mailing list when fixing the bug later otherwise).  cheers,

--j.

Paul Boven writes:
> Hi Justin, everyone,
> 
> Justin Mason wrote:
> 
> > It's worth checking this; that rule should fire only if the
> > mail really *did* come from Vonage.  I suspect a bug in how your
> > mailserver's Received headers are parsed.
> > 
> > Could you post:
> > 
> >   - a sample of a spam that passed this, with all headers
> >   - output of "spamassassin -D -L -t < spam", the lines with
> > 'received-header' and 'metadata' at least
> 
> Sure, see the attachments for the original and the output.
> 
> debug: received-header: parsed as [ ip=127.0.0.1 rdns=localhost 
> helo=localhost by=a48046.upc-a.chello.nl ident= envfrom= intl=0 
> id=k6D6gfSl010610 auth= ]
> debug: found fetchmail marker, restarting parse
> debug: received-header: parsed as [ ip=220.166.39.177 rdns=vm.vonage.com 
> helo=vm.vonage.com by=amsfep14-int.chello.nl ident= envfrom= intl=0 
> [EMAIL PROTECTED] auth= ]
> debug: received-header: relay 220.166.39.177 trusted? no internal? no
> debug: metadata: X-Spam-Relays-Trusted:
> debug: metadata: X-Spam-Relays-Untrusted: [ ip=220.166.39.177 
> rdns=vm.vonage.com helo=vm.vonage.com by=amsfep14-int.chello.nl ident= 
> envfrom= intl=0 
> [EMAIL PROTECTED] auth= ]
> debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x9d2fc4) 
> implements 'parsed_metadata'
> debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x9d2fc4) 
> implements 'parsed_metadata'
> debug: is DNS available? 0
> 
> Just to clarify: This also happens on mailservers that are directly 
> listening on port 25, not through fetchmail. 'DNS available 0' is a 
> surprise to me, because I've hardcoded it to 'yes' in local.cf. The 
> server that I pop the email from is listed as 'trusted' in my local.cf.
> 
> Regards, Paul Boven.
> 
> 
> 
> debug: SpamAssassin version 3.0.4
> debug: Score set 0 chosen.
> debug: running in taint mode? yes
> debug: Running in taint mode, removing unsafe env vars, and resetting PATH
> debug: PATH included '/opt/csw/bin', keeping.
> debug: PATH included '/usr/bin', keeping.
> debug: PATH included '/usr/sbin', keeping.
> debug: PATH included '/opt/sfw/bin', keeping.
> debug: PATH included '/opt/Adobe/Acrobat7.0/bin', keeping.
> debug: PATH included '/usr/local/bin', keeping.
> debug: Final PATH set to: 
> /opt/csw/bin:/usr/bin:/usr/sbin:/opt/sfw/bin:/opt/Adobe/Acrobat7.0/bin:/usr/local/bin
> debug: using "/etc/mail/spamassassin/init.pre" for site rules init.pre
> debug: config: read file /etc/mail/spamassassin/init.pre
> debug: using "/opt/SpamAssassin//share/spamassassin" for default rules dir
> debug: config: read file /opt/SpamAssassin//share/spamassassin/10_misc.cf
> debug: config: read file 
> /opt/SpamAssassin//share/spamassassin/20_anti_ratware.cf
> debug: config: read file 
> /opt/SpamAssassin//share/spamassassin/20_body_tests.cf
> debug: config: read file 
> /opt/SpamAssassin//share/spamassassin/20_compensate.cf
> debug: config: read file 
> /opt/SpamAssassin//share/spamassassin/20_dnsbl_tests.cf
> debug: config: read file /opt/SpamAssassin//share/spamassassin/20_drugs.cf
> debug: config: read file 
> /opt/SpamAssassin//share/spamassassin/20_fake_helo_tests.cf
> debug: config: read file 
> /opt/SpamAssassin//share/spamassassin/20_head_tests.cf
> debug: config: read file 
> /opt/SpamAssassin//share/spamassassin/20_html_tests.cf
> debug: config: read file 
> /opt/SpamAssassin//share/spamassassin/20_meta_tests.cf
> debug: config: read file /opt/SpamAssassin//share/spamassassin/20_phrases.cf
> debug: config: read file /opt/SpamAssassin//share/spamassassin/20_porn.cf
> debug: config: read file /opt/SpamAssassin//share/spamassassin/20_ratware.cf
> debug: config: read file /opt/SpamAssassin//share/spamassassin/20_uri_tests.cf
> debug: config: read file /opt/SpamAssassin//share/spamassassin/23_bayes.cf
> debug: config: read file 
> /opt/SpamAssassin//share/spamassassin/25_body_tests_es.cf
> debug: config: read file /opt/SpamAssassin//share/spamassassin/25_hashcash.cf
> de

Re: Abuse of SARE whitelist

2006-07-13 Thread Paul Boven

Hi Justin, everyone,

Justin Mason wrote:


It's worth checking this; that rule should fire only if the
mail really *did* come from Vonage.  I suspect a bug in how your
mailserver's Received headers are parsed.

Could you post:

  - a sample of a spam that passed this, with all headers
  - output of "spamassassin -D -L -t < spam", the lines with
'received-header' and 'metadata' at least


Sure, see the attachments for the original and the output.

debug: received-header: parsed as [ ip=127.0.0.1 rdns=localhost 
helo=localhost by=a48046.upc-a.chello.nl ident= envfrom= intl=0 
id=k6D6gfSl010610 auth= ]

debug: found fetchmail marker, restarting parse
debug: received-header: parsed as [ ip=220.166.39.177 rdns=vm.vonage.com 
helo=vm.vonage.com by=amsfep14-int.chello.nl ident= envfrom= intl=0 
[EMAIL PROTECTED] auth= ]

debug: received-header: relay 220.166.39.177 trusted? no internal? no
debug: metadata: X-Spam-Relays-Trusted:
debug: metadata: X-Spam-Relays-Untrusted: [ ip=220.166.39.177 
rdns=vm.vonage.com helo=vm.vonage.com by=amsfep14-int.chello.nl ident= 
envfrom= intl=0 
[EMAIL PROTECTED] auth= ]
debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x9d2fc4) 
implements 'parsed_metadata'
debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0x9d2fc4) 
implements 'parsed_metadata'

debug: is DNS available? 0

Just to clarify: This also happens on mailservers that are directly 
listening on port 25, not through fetchmail. 'DNS available 0' is a 
surprise to me, because I've hardcoded it to 'yes' in local.cf. The 
server that I pop the email from is listed as 'trusted' in my local.cf.


Regards, Paul Boven.



--- Begin Message ---

Do you like replica
joxaxajs http://www.conffortableoora.com

--- End Message ---
debug: SpamAssassin version 3.0.4
debug: Score set 0 chosen.
debug: running in taint mode? yes
debug: Running in taint mode, removing unsafe env vars, and resetting PATH
debug: PATH included '/opt/csw/bin', keeping.
debug: PATH included '/usr/bin', keeping.
debug: PATH included '/usr/sbin', keeping.
debug: PATH included '/opt/sfw/bin', keeping.
debug: PATH included '/opt/Adobe/Acrobat7.0/bin', keeping.
debug: PATH included '/usr/local/bin', keeping.
debug: Final PATH set to: 
/opt/csw/bin:/usr/bin:/usr/sbin:/opt/sfw/bin:/opt/Adobe/Acrobat7.0/bin:/usr/local/bin
debug: using "/etc/mail/spamassassin/init.pre" for site rules init.pre
debug: config: read file /etc/mail/spamassassin/init.pre
debug: using "/opt/SpamAssassin//share/spamassassin" for default rules dir
debug: config: read file /opt/SpamAssassin//share/spamassassin/10_misc.cf
debug: config: read file 
/opt/SpamAssassin//share/spamassassin/20_anti_ratware.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/20_body_tests.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/20_compensate.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/20_dnsbl_tests.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/20_drugs.cf
debug: config: read file 
/opt/SpamAssassin//share/spamassassin/20_fake_helo_tests.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/20_head_tests.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/20_html_tests.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/20_meta_tests.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/20_phrases.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/20_porn.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/20_ratware.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/20_uri_tests.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/23_bayes.cf
debug: config: read file 
/opt/SpamAssassin//share/spamassassin/25_body_tests_es.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/25_hashcash.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/25_spf.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/25_uribl.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/30_text_de.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/30_text_fr.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/30_text_nl.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/30_text_pl.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/50_scores.cf
debug: config: read file /opt/SpamAssassin//share/spamassassin/60_whitelist.cf
debug: using "/etc//mail/spamassassin" for site rules dir
debug: config: read file /etc//mail/spamassassin/70_sare_adult.cf
debug: config: read file /etc//mail/spamassassin/70_sare_bayes_poison_nxm.cf
debug: config: read file /etc//mail/spamassassin/70_sare_evilnum0.cf
debug: config: read file /etc//mail/spamassassin/70_sare_genlsubj0.cf
debug: config: read file /etc//mail/spamassassin/70_sare_header0.cf
debug: config: read file /etc//mail/spamassassin/70_sare_html0.cf
debug: config: read file /etc//mail/spamassassin

Re: The best way to use Spamassassin is to not use Spamassassin

2006-07-13 Thread Chris Lear

* Marc Perkel wrote (12/07/06 18:30):

Catchy subject line eh?

OK - so what I mean by this is that I now use SA for about 5% of all 
incoming email. The rest of the spam is rejected before I get to SA through 
a fairly large number of tricks that allow me to determine with near 
100% accuracy which things are spam. It is done mostly through behavior 
and karma related lists: being host blacklisted or URI blacklisted.


I don't know if it's relevant to Marc's point, but it seems to me that 
if SA was reduced to network checks only it would still be a very good 
blocker of spam. And perhaps what Marc is doing is, more or less, moving 
SA's network checks into the MTA and using them to reject rather than 
just score.


I suppose something similar would be to score all the URIBL rules and 
RCVD_IN rules high, and abandon the traditional regex rules.


Network checks are easily the most hit spam rules in SA anyway. Here's a 
bit of sa-stats for spam on a machine I look after (the MTA blocks based 
on sbl-xbl.spamhaus.org before anything gets to SA, so that's not 
represented here):


   1  BAYES_99
   2  URIBL_BLACK
   3  URIBL_SBL
   4  URIBL_JP_SURBL
   5  URIBL_OB_SURBL
   6  RCVD_IN_SORBS_DUL
   7  RCVD_IN_NJABL_DUL
   8  HTML_MESSAGE
   9  FORGED_RCVD_HELO
  10  URIBL_SC_SURBL
  11  URIBL_WS_SURBL
  12  SARE_MLB_Stock6
  13  URIBL_AB_SURBL
  14  SARE_MLB_Stock1
  15  STOCK_NAME_FVGT1
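
A ranking like this can also be approximated from the X-Spam-Status headers of already-scanned mail. The Python sketch below is not the sa-stats script (which has its own input and output format). It assumes each header value carries a "tests=RULE,RULE,..." field, and the names rules_hit and rank_rules are made up for illustration.

```python
import re
from collections import Counter

def rules_hit(status_header):
    """Extract the rule names from one X-Spam-Status header value."""
    # Folded headers break the list after a comma; join the pieces again.
    unfolded = re.sub(r",\s+", ",", status_header)
    m = re.search(r"\btests=(\S*)", unfolded)
    if m is None:
        return []
    return [rule for rule in m.group(1).split(",") if rule and rule != "none"]

def rank_rules(status_headers):
    """Return (rule, hit_count) pairs, most frequently hit rule first."""
    counts = Counter()
    for header in status_headers:
        counts.update(rules_hit(header))
    return counts.most_common()
```

Counting only headers from mail marked as spam would be the closest match to the list above; the sketch leaves that filtering to the caller.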



Of course that 5% is very important because that is where I get the
data for the other tests that allow me to bypass filtering.


Even this isn't necessarily so. Data for network tests can be collected 
automatically, by trapping spammers who trawl the web/usenet for 
addresses, those who scan for open port 25s, or those who try high MX's. 
So at least some useful data can be collected without SA, or even human 
intervention.



But - I want you all to start thinking of a new way to look at spam
filtering.


I'm not sure this is a "new way to look at spam filtering", but I agree 
that content testing against regular expressions is increasingly looking 
like a crude and easily-outwitted technique compared to dns tests. Bayes 
is still good, though.


SA RPMs for Suse 10.0?

2006-07-13 Thread Paul Hutchings
I have a SUSE 10.0 system running SpamAssassin from the SUSE RPMs that
used to be in Carsten Hoeger's SUSE FTP folder.

They don't appear to be there any more, so I wondered if anyone had any
pointers to compatible RPMs to get me up to 3.1.3?

I know I could uninstall the existing SpamAssassin RPMs and grab it from
CPAN, but I don't know whether that might cause more problems than a simple
RPM upgrade.

cheers,
Paul
--
Paul Hutchings
Network Administrator, MIRA Ltd.
Tel: 44 (0)24 7635 5378, Fax: 44 (0)24 7635 8378
mailto:[EMAIL PROTECTED]


Re: BAYES_99 makes lots of false-positive

2006-07-13 Thread Johann Spies
On Thu, Jul 13, 2006 at 03:17:05PM +0800, Joshua, C.S. Chen wrote:
> Hello folks,
> My users speak Chinese. I have found that SpamAssassin's Bayes does not seem to
> work well with Chinese charsets (UTF-8 or Big5). Many normal mails (almost
> all) get a BAYES_99 score, although the real spam also gets BAYES_99. It
> looks like mail in a foreign language such as Chinese gets a high Bayes score
> very easily.
> I have set ok_locales to all, but it doesn't help with the false-positive problem.
> 
> And another question: what happens if I do sa-learn --dump? Am I
> supposed to see the phrases that SA has learned, such as key phrases and words
> from the spam mails? If so, can I see some Chinese phrases?

Do you regularly "feed" the spam filter Chinese emails, both ham and
spam?  That would probably be the best way to improve the accuracy
of the Bayesian filter.

Regards
Johann
-- 
Johann Spies  Telefoon: 021-808 4036
Informasietegnologie, Universiteit van Stellenbosch

 "Let your character be free from the love of money,
  being content with what you have; for He Himself has
  said, "I will never desert you, nor will I ever
  forsake you."
  Hebrews 13:5


Re: Abuse of SARE whitelist

2006-07-13 Thread Justin Mason

It's worth checking this; that rule should fire only if the
mail really *did* come from Vonage.  I suspect a bug in how your
mailserver's Received headers are parsed.

Could you post:

  - a sample of a spam that passed this, with all headers
  - output of "spamassassin -D -L -t < spam", the lines with
'received-header' and 'metadata' at least

--j.

Paul Boven writes:
> Hi everyone,
> 
> Paul Boven wrote:
> 
> > One of my users just spotted a FN that had managed to slip through. 
> > They're abusing 70_sare_whitelist.cf, specifically:
> > 
> > whitelist_from_rcvd   [EMAIL PROTECTED] vonage.com
> >   # Vonage voice mail notification
> 
> I'm now catching these on several mailservers that we run, so I'm 
> assuming this is getting abused quite a bit. And it's very effective 
> because the default score for whitelist_from_rcvd is -100. What worries 
> me is that whitelist_from_rcvd gets triggered, even though the mail 
> obviously is forged, unless vonage sends their mails from China.
> 
> So my question is, still, why does the email (see my previous posting 
> for headers) hit whitelist_from_rcvd? Are my trusted networks 
> confused? Does it get hit because the mail was processed by the 
> (trusted) backup-MX first?
> 
> Regards, Paul Boven.


Re: The best way to use Spamassassin is to not use Spamassassin

2006-07-13 Thread Bart Schaefer

On 7/12/06, Marc Perkel <[EMAIL PROTECTED]> wrote:


Depending on what he's doing, it might work.


He's writing procmail recipes.  He's a user on a hosted shell server,
not a sysadmin.  Strictly delivery-time header text analysis, no
MTA-level configuration games.


For example, anyone can do this trick. Set your highest MX record


I'm amused by your definition of "anyone."


(add a new one) to an IP address that doesn't exist.


We actually tried that (really, we set it to point to a virtual IP on
the same server that is the primary MX, so that one was only available
when the primary also was), and had a dummy port 25 listener on that
IP to 554 everything that connected.  It stopped about 1% of our spam;
when we had to change hardware we didn't bother bringing it along.  As
I recall it worked slightly better to make it the second MX rather
than the highest one.
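
For reference, a dummy listener of that kind can be sketched in a few lines of Python. This is not the code we ran; the names reply_for, RejectAllHandler and RejectAllServer and the reply wording are made up for illustration. It follows the SMTP rule that a server may greet with 554 instead of 220, and should then answer anything other than QUIT with 503.

```python
import socketserver

def reply_for(command):
    """Return (reply_bytes, close_connection) for one client command line.

    Pass None to get the initial greeting.
    """
    if command is None:
        return b"554 No SMTP service here\r\n", False
    if command.strip().upper() == b"QUIT":
        return b"221 Bye\r\n", True
    return b"503 Bad sequence of commands\r\n", False

class RejectAllHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Greet with 554 instead of the usual 220.
        greeting, _ = reply_for(None)
        self.wfile.write(greeting)
        # Keep refusing commands until the client sends QUIT or disconnects.
        for line in self.rfile:
            reply, close = reply_for(line)
            self.wfile.write(reply)
            if close:
                break

class RejectAllServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True
```

Something like RejectAllServer(("127.0.0.1", 2525), RejectAllHandler).serve_forever() would run it on an unprivileged test port; binding to port 25 normally requires root.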

We're wandering a bit off topic here, though.


No dns lookup in received header for trusted

2006-07-13 Thread Schmid Benoit

Hello,

I have a problem with my trusted network config.

An email with the following Received header line
is not recognized as coming from my trusted network.
---
Received: from CONVERSION-DAEMON.romeo.unige.ch by romeo.unige.ch
 (PMDF V6.2-1x9 #31144) id <[EMAIL PROTECTED]>; Mon,
 03 Jul 2006 17:10:03 +0200 (MEST)
---

But CONVERSION-DAEMON.romeo.unige.ch resolves in the DNS
to an IP address that belongs to my trusted network.

Therefore it seems that SA is not doing a DNS lookup to
check whether it is in the trusted network.

What should I do so that a DNS lookup is used when no ([ip address])
is present in the Received header line?
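
To illustrate the check I have in mind, here is a Python sketch. It is only a model of the idea, not a SpamAssassin option or its code; the function names relay_ip and relay_is_trusted and the networks used in any test are hypothetical. Note that the host name in a Received line is written by the sending side, so resolving it is only as reliable as the server that added the header.

```python
import ipaddress
import re
import socket

def relay_ip(received, resolve=socket.gethostbyname):
    """Return the relay IP from a Received line, resolving the host name if no [ip] is present."""
    bracketed = re.search(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]", received)
    if bracketed:
        return bracketed.group(1)
    named = re.search(r"\bfrom\s+(\S+)", received)
    if named is None:
        return None
    try:
        return resolve(named.group(1))
    except OSError:
        # Covers socket.gaierror: the name did not resolve.
        return None

def relay_is_trusted(received, trusted_networks, resolve=socket.gethostbyname):
    """True if the relay IP falls inside one of the trusted networks (CIDR strings)."""
    ip = relay_ip(received, resolve)
    if ip is None:
        return False
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False
    return any(addr in ipaddress.ip_network(net) for net in trusted_networks)
```

The resolve parameter exists so that the lookup can be replaced by a stub when testing without DNS.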

Thanks in advance for your help.
--
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/

  Benoit Schmid  Tel: (++41-22) 379-7209
  UNIGE Postmaster

  University of Geneva - Information Technology Division

_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/


Re: Abuse of SARE whitelist

2006-07-13 Thread Paul Boven

Hi everyone,

Paul Boven wrote:

One of my users just spotted a FN that had managed to slip through. 
They're abusing 70_sare_whitelist.cf, specifically:


whitelist_from_rcvd   [EMAIL PROTECTED] vonage.com
  # Vonage voice mail notification


I'm now catching these on several mailservers that we run, so I'm 
assuming this is getting abused quite a bit. And it's very effective 
because the default score for whitelist_from_rcvd is -100. What worries 
me is that whitelist_from_rcvd gets triggered, even though the mail 
obviously is forged, unless vonage sends their mails from China.


So my question is, still, why does the email (see my previous posting 
for headers) hit whitelist_from_rcvd? Are my trusted networks 
confused? Does it get hit because the mail was processed by the 
(trusted) backup-MX first?
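
For reference, this is how I understand the rule to work: whitelist_from_rcvd should hit only when the sender address matches the pattern and the reverse DNS of the relay that handed the mail to my servers matches the given domain. The Python sketch below is a simplified model of that check, not SpamAssassin's code. The function name whitelist_from_rcvd_hits, the address pattern *@vm.vonage.com and the sample sender are hypothetical, since the real entry is redacted above.

```python
from fnmatch import fnmatchcase

def whitelist_from_rcvd_hits(from_addr, relay_rdns, addr_glob, rdns_domain):
    """Simplified model: the address must match the glob and the rDNS must be in the domain."""
    addr_ok = fnmatchcase(from_addr.lower(), addr_glob.lower())
    rdns = relay_rdns.lower().rstrip(".")
    domain = rdns_domain.lower().rstrip(".")
    # An empty rDNS (no reverse lookup result) can never match.
    rdns_ok = rdns == domain or rdns.endswith("." + domain)
    return addr_ok and rdns_ok
```

Under this model a forged sender alone is not enough: a relay with no rDNS, or with rDNS outside vonage.com, should not trigger the rule.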


Regards, Paul Boven.


BAYES_99 makes lots of false-positive

2006-07-13 Thread Joshua, C.S. Chen
Hello folks,
My users speak Chinese. I have found that SpamAssassin's Bayes does not seem to
work well with Chinese charsets (UTF-8 or Big5). Many normal mails (almost
all) get a BAYES_99 score, although the real spam also gets BAYES_99. It
looks like mail in a foreign language such as Chinese gets a high Bayes score
very easily.
I have set ok_locales to all, but it doesn't help with the false-positive problem.

And another question: what happens if I do sa-learn --dump? Am I
supposed to see the phrases that SA has learned, such as key phrases and words
from the spam mails? If so, can I see some Chinese phrases?


Cheers
Joshua



Re: sa-learn slow with Bayes and PostgreSQL

2006-07-13 Thread Paul Boven

Hi Michael, everyone,

Michael Parker wrote:

I notice that using sa-learn with SQL now is very slow compared to file db.
Is this normal, and is accessing the db while scanning mail any slower with
SQL?



Yes.  Check out the benchmarks here:
http://wiki.apache.org/spamassassin/BayesBenchmarkResults


Thanks for doing that benchmark!
Question: what exactly is the difference between DB_File and SDBM.pm? 
I've never heard of SDBM.pm, and it seems a lot faster. It seems to 
be part of my SpamAssassin 3.1.1 installation, but we use DB_File too 
(with mimedefang, I think).


Regards, Paul Boven.