Re: Auto Learn Spam

2010-04-28 Thread Dennis B. Hopp

On Wed, 2010-04-28 at 11:53 -0400, Carlos Mennens wrote:
 I noticed when reviewing headers today that there was a section for
 'autolearn=no' and was wondering what exactly does this mean and
 wouldn't autolearn be a good thing? I use Amavisd-new which calls out
 to SpamAssassin modules but I don't have the spamd daemon running
 physically. The Amavisd-new daemon simply loads the modules for spamd
 and does the scoring directly, saving my mail server from running more
 daemons and using more system resources than it needs to. So below are
 the headers:
 

Autolearn kicks in at certain scores.  I believe the default is 12.0 for
spam and 0.1 for ham.  You can customize those settings in your local.cf
file.

bayes_auto_learn 1
bayes_auto_learn_threshold_nonspam -3.0
bayes_auto_learn_threshold_spam 12.0

I changed the default value for nonspam because the majority of my users
don't train bayes and so the default value could cause bayes to learn
incorrectly if a spam message scored low (maybe no network rules or URI
rules triggered the first few times).

 X-Spam-Status: No, score=2.808 tagged_above=-999 required=5
 tests=[BAYES_50=0.8, HTML_IMAGE_ONLY_24=1.618, HTML_MESSAGE=0.001,
 HTML_MIME_NO_HTML_TAG=0.377, MIME_HTML_ONLY=0.723,
 RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01]
 autolearn=no
 

This particular message scored a 2.808 so it's not high or low enough
for bayes to know which way it should learn the message.
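
To make the thresholds concrete, here is a rough Python sketch of the decision (it is not SpamAssassin's actual code). The function name is made up for illustration. It also ignores two things: autolearn works from its own version of the score that leaves out Bayes and whitelist rules, and SpamAssassin has extra requirements before it learns a message as spam.

```python
def autolearn_decision(score, ham_threshold=0.1, spam_threshold=12.0):
    """Return 'ham', 'spam' or 'no' for a given autolearn score.

    The defaults mirror the stock thresholds mentioned above; pass
    ham_threshold=-3.0 to model the local.cf shown above.
    """
    if score < ham_threshold:
        return "ham"
    if score >= spam_threshold:
        return "spam"
    return "no"


# The message from the headers in question, with stock and custom thresholds.
print(autolearn_decision(2.808))
print(autolearn_decision(2.808, ham_threshold=-3.0))
```

Either way 2.808 falls between the two thresholds, which is what autolearn=no reports.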

--Dennis



Re: Auto Learn Spam

2010-04-28 Thread Dennis B. Hopp

On Wed, 2010-04-28 at 12:38 -0400, Carlos Mennens wrote:

 I checked /etc/mail/spamassassin/local.cf just now and found only the 
 following:
 
 required_hits 5
 report_safe 0
 rewrite_header Subject [SPAM]
 
 However I don't know if Amavisd-new is looking at local.cf because I
 show parameters in my amavisd.conf file for SpamAssassin:
 
 $sa_tag_level_deflt  = -999.0;  # add spam info headers if at, or above that level
 $sa_tag2_level_deflt = 5.0; # add 'spam detected' headers at that level
 $sa_kill_level_deflt = 8.0; # triggers spam evasive actions (e.g. blocks mail)
 $sa_dsn_cutoff_level = 10;  # spam level beyond which a DSN is not sent
 $sa_quarantine_cutoff_level = 12; # spam level beyond which quarantine is off
 $penpals_bonus_score = 8;# (no effect without a @storage_sql_dsn database)
 $penpals_threshold_high = $sa_kill_level_deflt;  # don't waste time on hi spam
 

These settings are for amavisd-new and not spamassassin.  Amavisd-new is
the glue between your MTA and spamassassin (and virus scanners).  Most
of the behavior of spamassassin is still controlled through the local.cf
(although some settings can be defined in both places and the
amavisd.conf file will take precedence).

 $sa_mail_body_size_limit = 400*1024; # don't waste time on SA if mail is larger
 $sa_local_tests_only = 0; # only tests which do not require internet access?
 [...]
 $sa_spam_subject_tag = '***SPAM*** ';
 $defang_virus  = 1;  # MIME-wrap passed infected mail
 $defang_banned = 1;  # MIME-wrap passed mail containing banned name
 # for defanging bad headers only turn on certain minor contents categories:
 $defang_by_ccat{+CC_BADH.",3"} = 1;  # NUL or CR character in header
 $defang_by_ccat{+CC_BADH.",5"} = 1;  # header line longer than 998 characters
 
 When I get a spam message that was scored by SA, it says ***SPAM***
 and not [SPAM] so that leads me to believe that SA parameters are
 being fed from amavisd.conf file. Does this make sense to you guys?

This is just the setting in amavisd.conf taking precedence.  If you were
to comment out $sa_spam_subject_tag I *believe* the value in your
local.cf would then be used.




Re: multiple instances

2010-04-16 Thread Dennis B. Hopp

On Fri, 2010-04-16 at 10:08 -0700, Gary Smith wrote:
 I have a need to run several different instances of SA on a single box (in 
 development).  In production, we have 3  different SA environments (with 2+ 
 servers each) that have different rule sets and specific routing rules 
 determine which instance it gets sent to.   We need to mimic this in 
 development.  
 
 Ideally I would like to create all 3 instances (*2 mimicking load balancing) 
 on a single development box.  We're not worried about the performance or 
 memory aspect.
 
 Is this possible, and if so, is there an easy way to do this?   I was 
 thinking that I could create separate chroot environments for each one if 
 necessary and either bind each instance to an IP (which I'm not sure if 
 that's possible) or at least a different port.
 
 Any advice (or some sample scripts on doing this) would be greatly 
 appreciated.
 

I'm sure it's possible, but rather than going through all the work of
trying to script and set up chroot environments, why not use VMs?  You
can then quite literally match the production setup.

Since you are not worried about performance or memory you could give
each VM 128 MB of RAM and only be using 1 GB or so total...

--Dennis



Re: Quarantine Management

2010-04-10 Thread Dennis B. Hopp

Quoting Alex mysqlstud...@gmail.com:


Hi,

Just wondering what other tools are out there that people like.

I use postfix as my MTA right now, but am not completely opposed to using
something else if necessary to use a specific quarantine system.


Amavisd-new works well with postfix


maia mailguard using amavisd-new but an old version.



I think he's probably referring to something that would help him
manage the quarantine itself, such as to query it for FNs, provide
some type of reporting, forward FPs back to the proper recipient,
manage expiry, expunging, and scoring, etc?


Yes exactly what I'm referring to.  Wishlist would be:

User controllable (i.e. users can release spam messages back into their
mailbox)

Whitelist/blacklist management
Domain configurations

maia mailguard has pretty much all of that but hasn't been updated in  
a while, just looking for other possibilities.


Do people just flag the message as spam (maybe in the header) and then
let users filter to a spam folder?  We are using this as a front end
to exchange so I guess we could just flag it and then have exchange
deliver it to the user's Junk E-mail folder, but then bayes can't
learn from its mistakes as easily.


--Dennis


AWL

2010-04-09 Thread Dennis B. Hopp
I have AWL enabled and it seems to be ok with helping out legitimate
senders that occasionally send a spammy type message, but lately I
have seen an increase where AWL is adding a negative score to a very
blatant spam.  

So my questions are, do people feel AWL is worth having enabled?  

Is there a way to have the AWL rule only triggered if there is a minimum
number of messages seen by that sender?

--Dennis



Re: AWL

2010-04-09 Thread Dennis B. Hopp

 Not that I'm aware of.
 
 Is the AWL score enough to prevent the messages from being marked as
 spam, or are you seeing the negative AWL score on messages that are
 marked as spam?  It is normal for AWL to give negative scores to spam
 from time to time, but for the most part, it should not be enough to
 push the score below the spam threshold.

Not usually, but I have seen a few messages that triggered BAYES_99 or
BAYES_95 and then a few other rules that pushed the score to just above
5.0 (which is what I block at) and then AWL will come in with say a
-0.35 and drop the overall score to 4.8.

I know how AWL works and occasionally it will lower the score of a spam,
but it just seems to be happening more often lately.  I store my AWL in
mysql so I just deleted all entries that have a count of less than 20.
I think pretty much every time this happens the AWL count is low (maybe
3 or 4). 
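
For anyone following along, the arithmetic behind that can be sketched in Python. This is only an approximation of what the AWL plugin does: it pulls the score part of the way towards the sender's historical mean, scaled by auto_whitelist_factor (0.5 by default). The function name and the min_count parameter are hypothetical; min_count is not a SpamAssassin option, it just models the "minimum number of messages" behaviour asked about in this thread. The history values are invented to reproduce the 5.15 to 4.8 case.

```python
def awl_adjustment(current_score, total_score, count, factor=0.5, min_count=0):
    """Approximate the AWL delta for one message.

    total_score and count are the sender's stored history; factor plays
    the role of auto_whitelist_factor.  min_count is hypothetical and
    skips the adjustment for senders seen fewer than min_count times.
    """
    if count == 0 or count < min_count:
        return 0.0
    mean = total_score / count
    return (mean - current_score) * factor


# Invented history: sender seen 4 times, mean 4.45; new message scores 5.15.
delta = awl_adjustment(5.15, total_score=17.8, count=4)
print(round(delta, 2), round(5.15 + delta, 2))

# With a minimum count of 20 the same sender would get no adjustment.
print(awl_adjustment(5.15, total_score=17.8, count=4, min_count=20))
```

With only 3 or 4 earlier messages, one or two low-scoring ones are enough to set a mean that drags the next message under the threshold.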

--Dennis



KHOP_RCVD_TRUST

2010-03-26 Thread Dennis B. Hopp
I received the following e-mail

http://pastebin.com/JXr9buxi

It had a total score of 4.973 (blocked at 5).  Among other rules it hit:

KHOP_RCVD_TRUST=-1.75,RCVD_IN_DNSWL_MED=-0.5,SPF_PASS=-0.001

So is the KHOP_RCVD_TRUST score too low?  Should I possibly consider
making that -0.75 or something?  Is there a way to report FP to KHOP?

Thanks,

--Dennis




Re: KHOP_RCVD_TRUST

2010-03-26 Thread Dennis B. Hopp

On Fri, 2010-03-26 at 11:35 -0400, Michael Scheidell wrote:
 
 On 3/26/10 10:41 AM, Dennis B. Hopp wrote:
  I received the following e-mail
 
  http://pastebin.com/JXr9buxi
 
  It had a total score of 4.973 (blocked at 5).  Among other rules it hit:
 
  KHOP_RCVD_TRUST=-1.75,RCVD_IN_DNSWL_MED=-0.5,SPF_PASS=-0.001
 
 
 is that an old rule? i just checked SA updates, and I don't see that 
 rule in current SA 3.3.1
 
 so, who is KHOP?  I looked in rule sets and don't know them.  were these 
 rules inherited from some outside trusted source?
 
 

http://khopesh.com/wiki/Anti-spam#sa-update_channels

I believe some of his rules have been incorporated into mainline sa.
I'm using 3.3.1.  I just got an update from some of the KHOP channels
yesterday so they appear to be maintained.

--Dennis



Re: Upgrading to SpamAssassin 3.3

2010-03-17 Thread Dennis B. Hopp

On Wed, 2010-03-17 at 11:35 -0400, Kaleb Hosie wrote:
 Hello,
 I'm running SA 3.2.5 on CentOS 5.4 and I've noticed that a newer major 
 release has been released. The server is currently in production so I'm a bit 
 leery to upgrade.
 
 Do you feel that it is worth the upgrade to 3.3? Is there anything I should 
 know before I go ahead and upgrade?
 

I upgraded CentOS 5.4 to 3.3.0 and only ran into one issue which had
nothing to do with spamassassin.  The upgrade of spamassassin went fine
but I use it with maia-mailguard and the current stable version of
maia-mailguard does not work correctly with 3.3.0.  There is a patch in
the svn for maia that fixes the issue.

--Dennis



Re: Bogus mails from hijacked accounts

2010-03-12 Thread Dennis B. Hopp

 describe FORGED_HOTMAIL   Hotmail with non-Hotmail Reply-to address
 header   __FORGED_HM1 From =~ /\...@hotmail\.com/i
 header   __FORGED_HM2 Reply-to =~ /\...@hotmail\.com/i
 meta     FORGED_HOTMAIL   (__FORGED_HM1 && !__FORGED_HM2)
 score    FORGED_HOTMAIL   5.0
 
 and write cookie cutter rules for Yahoo and Gmail. 
 
 OTOH if you're happy that a Japanese test won't generate FPs you can
 cover all three ISPs with one rule:  
 
 describe FORGED_FROM Hotmail,Yahoo or Google with Japanese Reply-to 
 header   __FF1   From =~ /\@(hotmail|yahoo|gmail)\.com/i
 header   __FF2   Reply-to =~ /\.jp/i
 meta     FORGED_FROM (__FF1 && __FF2)
 score    FORGED_FROM 5.0
 
 Of course, if it's just a few Japanese ISPs being used you can easily
 make __FF2 more specific.
 

I tried this for yahoo...

describe FORGED_YAHOO Yahoo with non-Yahoo Reply-to address
header   __FORGED_YH1 From =~ /\...@yahoo\.com/i
header   __FORGED_YH2 Reply-to =~ /\...@yahoo\.com/i
meta     FORGED_YAHOO (__FORGED_YH1 && !__FORGED_YH2)
score    FORGED_YAHOO 0.25

And it triggered on a message with the following header

http://pastebin.com/qs18DpYn

My best guess is it is using the In-Reply-To header...is there a way
to differentiate In-Reply-To and Reply-To ?

Thanks,

--Dennis



Re: [sa] Re: Bogus mails from hijacked accounts

2010-03-12 Thread Dennis B. Hopp


 The problem with this is that the !__FORGED_YH2 matches
 when there is *NO* Reply-To header at all!
 
 You need something like this:
 
 header __FORGED_YH2 Reply-To =~ /\@([^y]|y[^a]|ya[^h]|yah[^o])/i
 meta FORGED_YAHOO (__FORGED_YH1 && __FORGED_YH2)
 
 (remove the negation from the meta)
 This directly tests for an existing Reply-To specifically to a domain
 that does not begin with 'yaho'.

Wouldn't that meta rule trigger when the reply-to contained 'yaho'?  I
want to trigger when the from contains yahoo.com and the reply-to does
not.

 
 However, keep in mind that the headers for *this* mailing list would 
 trigger your rule. So you will also need to meta this with a rule that 
 tests for yahoo mail server being the sending SMTP client
 

Good point.  I didn't think about that..

--Dennis



Re: [sa] Re: Bogus mails from hijacked accounts

2010-03-12 Thread Dennis B. Hopp

On Fri, 2010-03-12 at 12:52 -0600, Dennis B. Hopp wrote:
 
  The problem with this is that the !__FORGED_YH2 matches
  when there is *NO* Reply-To header at all!
  
  You need something like this:
  
  header __FORGED_YH2 Reply-To =~ /\@([^y]|y[^a]|ya[^h]|yah[^o])/i
  meta FORGED_YAHOO (__FORGED_YH1 && __FORGED_YH2)
  
  (remove the negation from the meta)
  This directly tests for an existing Reply-To specifically to a domain
  that does not begin with 'yaho'.
 
 Wouldn't that meta rule trigger when the reply-to contained 'yaho'?  I
 want to trigger when the from contains yahoo.com and the reply-to does
 not.

Nevermind..the '^' inside brackets negates..I get it now..
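
The difference between the two versions is easy to check outside SpamAssassin. Below is a small Python sketch (not SA rule syntax) that mimics both metas with the `re` module. The helper names and sample addresses are made up for illustration, and a missing Reply-To header is modelled as None.

```python
import re

YAHOO = re.compile(r"@yahoo\.com", re.I)
# Same idea as the suggested __FORGED_YH2: an "@" followed by a domain
# that does not start with "yaho".
NOT_YAHOO = re.compile(r"@([^y]|y[^a]|ya[^h]|yah[^o])", re.I)


def negated_meta(from_hdr, reply_to):
    """First attempt: From is Yahoo AND NOT (Reply-To is Yahoo)."""
    reply_is_yahoo = bool(reply_to and YAHOO.search(reply_to))
    return bool(YAHOO.search(from_hdr)) and not reply_is_yahoo


def positive_meta(from_hdr, reply_to):
    """Suggested fix: From is Yahoo AND Reply-To exists and is non-Yahoo."""
    reply_not_yahoo = bool(reply_to and NOT_YAHOO.search(reply_to))
    return bool(YAHOO.search(from_hdr)) and reply_not_yahoo


sender = "someone@yahoo.com"
for reply_to in (None, "someone@yahoo.com", "other@example.jp"):
    print(reply_to, negated_meta(sender, reply_to), positive_meta(sender, reply_to))
```

The negated version fires when there is no Reply-To at all, while the positive version only fires when a Reply-To is present and points somewhere other than Yahoo.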



Re: My First Spam Mail Today

2010-03-12 Thread Dennis B. Hopp

 My headers look like:
 
 X-Spam-Checker-Version: SpamAssassin 3.3.0 (2010-01-18) on mail.iamghost.com
 X-Spam-Level: *
 X-Spam-Status: No, score=1.0 required=6.3
  tests=EXTRA_MPART_TYPE,HTML_MESSAGE autolearn=no version=3.3.0
 
 *
 

The message scored a 1.0 (score=1.0) but the X-Spam-Score header
apparently wasn't added to the message.
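
If the score is needed on a server that doesn't add X-Spam-Score, it can be pulled out of X-Spam-Status instead. A minimal Python sketch, assuming the header value looks like the two examples in this thread; the function name and the returned dictionary layout are just illustrative.

```python
import re

# Verdict, then score=..., then (possibly after other fields) required=...
STATUS = re.compile(
    r"^(Yes|No),\s*score=(-?\d+(?:\.\d+)?)\s.*?required=(-?\d+(?:\.\d+)?)",
    re.S,
)


def parse_spam_status(value):
    """Return verdict, score and threshold from an X-Spam-Status value."""
    match = STATUS.match(value.strip())
    if match is None:
        return None
    verdict, score, required = match.groups()
    return {"spam": verdict == "Yes", "score": float(score), "required": float(required)}


sample = ("No, score=1.0 required=6.3\n"
          " tests=EXTRA_MPART_TYPE,HTML_MESSAGE autolearn=no version=3.3.0")
print(parse_spam_status(sample))
```

The same pattern also copes with the amavisd-new style header, where tagged_above sits between score and required.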

 The above snippet shows no score as I would expect to see below from a
 different server:
 
 X-Spam-Flag: NO
 X-Spam-Score: -1.15
 X-Spam-Level:
 X-Spam-Status: No, score=-1.15 tagged_above=-999 required=5
 tests=[BAYES_00=-2.599, MSGID_MULTIPLE_AT=1.449] autolearn=no
 
 *
 
 Am I missing something in my local.cf that is not properly scoring all
 incoming messages?

In this example you also have tagged_above=-999 which leads me to
believe you are using amavisd-new.  Are both servers using
amavisd-new?  

--Dennis



Re: Bogus mails from hijacked accounts

2010-03-11 Thread Dennis B. Hopp

 1)  Spammers rotate sender addresses and hijacked account info more 
 often than most of us change our underwear.  An account *may* get 
 reused;  chances are it'll be months before it does, and the spammers 
 will have rotated through hundreds or thousands of others - both 
 phish-cracked and those set up just to send their junk.  Blacklisting a 
 sender is reduced to blocking the persistent friend-of-a-friend who 
 refuses to remove you from the endless stream of chain-forwards, and 
 legitimate-but-totally-clueless mailing list operators who can't figure 
 out how to unsubscribe you from their list.  :(
 
 2)  You noted originally that these appear to be fully legitimate 
 freemail accounts, legitimately used in the past to correspond with your 
 customers/clients, that have been compromised and then used to send 
 spam.  How do you propose to still allow the legitimate account holders 
 to email your clients if you blacklist the sender?
 

I don't want to blacklist the address, hence the reason why in my
original e-mail I said other than blacklisting.  I know blacklisting
would block these bogus e-mails as well as legit e-mails as soon as the
clients get access back (they currently don't have access to their
accounts because their passwords have been changed).  


 
 Martin's suggestion followup should point you in the right direction. 
 Sets of phrase rules (how similar are these messages?  do you have ten 
 or fifteen you can compare sentence-by-sentence?) with low scores will 
 likely help some too.  Meta rules that bump the score up depending on 
 how many phrases hit, or phrase+mismatched-sender/reply also work 
 tolerably well on this class of spam... if you can get enough samples to 
 build a complete enough set of phrase rules.

I'm going to look at what Martin suggested and compare it to what
samples I have.

Thanks,

--Dennis




Re: Bogus mails from hijacked accounts

2010-03-11 Thread Dennis B. Hopp

 It's not conditional, just using a meta rule and negating the Reply-to
 test in the meta:
 
 describe FORGED_HOTMAIL   Hotmail with non-Hotmail Reply-to address
 header   __FORGED_HM1 From =~ /\...@hotmail\.com/i
 header   __FORGED_HM2 Reply-to =~ /\...@hotmail\.com/i
 meta     FORGED_HOTMAIL   (__FORGED_HM1 && !__FORGED_HM2)
 score    FORGED_HOTMAIL   5.0
 
 and write cookie cutter rules for Yahoo and Gmail. 
 
 OTOH if you're happy that a Japanese test won't generate FPs you can
 cover all three ISPs with one rule:  
 
 describe FORGED_FROM Hotmail,Yahoo or Google with Japanese Reply-to 
 header   __FF1   From =~ /\@(hotmail|yahoo|gmail)\.com/i
 header   __FF2   Reply-to =~ /\.jp/i
 meta     FORGED_FROM (__FF1 && __FF2)
 score    FORGED_FROM 5.0

Thanks Martin.  This is actually far simpler than I was thinking it
would be.

--Dennis



Re: Bogus mails from hijacked accounts

2010-03-11 Thread Dennis B. Hopp

 I don't think the accounts were hijacked: the headers showed that the
 messages the OP posted were not sent from the domain hosting the mail
 accounts. It looked to me as if somebody has sold on lists of valid
 hotmail etc. accounts.
 
 I smell an inside job, or at least some careful preparation, because the
 OP reckons that these accounts (forged as sender) were paired with valid
 accounts he hosts that would be used by the owner of the forged account.
 The messages I saw took the form:

We got one owner of the hijacked accounts to admit he got an e-mail that
basically said "Hi, we are trying to get rid of dead accounts so please
click here to verify your information."  The site then very nicely asked
for his username/password, which he gave, and then voilà, no more access
to his account.  The message was then sent to every address in his
address book (which is why many of my users got the same message).

Sadly, we have had this happen a couple of times with hotmail and yahoo 
addresses.

What can I say, some of our clients aren't exactly the most tech savvy.

--Dennis



Re: Bogus mails from hijacked accounts

2010-03-11 Thread Dennis B. Hopp

 ...and I suppose the same would apply to social networks. I don't use
 either, so am somewhat clueless about what goodies are available if you
 can access their accounts.
 

I have some free e-mail accounts that I use as throwaway accounts,
for when a site just HAS to have a valid e-mail so you can read the news
article or whatever.  I might login to the accounts about once a month.

   The one of these I encountered at $DAYJOB was sent to the account
  owner's wife's ex-husband-- not my first choice when asking for emergency
  funds. The email also claimed he was traveling in London-- the guy AFAIK
  hasn't left Texas, let alone the US, in the past few years-- and used a
  number of phrases that a native speaker of American so-called-English
  wouldn't.
 
 OK, looks like I hugely overestimated the intelligence of recipients of
 such scams and hence the care needed to target an attack.
 

It's a sad thing, but a lot of people fall for stupid scams every day...



Bogus mails from hijacked accounts

2010-03-10 Thread Dennis B. Hopp
We seem to be having a problem where clients that we interact with
regularly are having their hotmail/gmail/yahoo accounts hijacked.  We
are receiving e-mails from their accounts that legitimately go through
the correct servers (hotmail, yahoo, etc.) and so they get passed through
our spam filters.  The messages have different bodies but basically say
the same thing: that they were on vacation and had all their money stolen
so they need to have money wire transferred to them.

Obviously we just have to tell the clients that they need to deal with
the various e-mail providers, but is there an effective way that I can
filter these messages out before my users see them without blacklisting
the address?  In one case I had probably 15 users that received the same
message and naturally they freaked out.

I have put a sample at:

http://pastebin.com/9BDXrxmm

Note I did change the real e-mail address in this message but the
hotmail address used is valid just masked.

The message doesn't hit any rules of significance on my system.

BAYES_00=-1.9,FREEMAIL_FROM=0.001,HTML_MESSAGE=0.001,RCVD_IN_DNSWL_NONE=-0.0001,SPF_PASS=-0.001,T_RP_MATCHES_RCVD=-0.01,T_TO_NO_BRKTS_FREEMAIL=0.01


Thanks

--Dennis



Re: Bogus mails from hijacked accounts

2010-03-10 Thread Dennis B. Hopp

On Wed, 2010-03-10 at 20:22 +, Martin Gregorie wrote:
 On Wed, 2010-03-10 at 13:37 -0600, Dennis B. Hopp wrote:
 
  Obviously we just have to tell the clients that they need to deal with
  the various e-mail providers, but is there an effective way that I can
  filter these messages out before my users see them without blacklisting
  the address?
 
 There's nothing in SA that can blacklist a sending MTA, so blacklisting
 can't happen unless you've added something to your MTA set-up that does
 auto-blacklisting.
 

I meant blacklisting the sender address, not the MTA.

 The question then comes down to marking the message as spam and dealing
 with it however you normally deal with spam. You'll probably need custom
 rule(s) to handle that. You say the message bodies are quite variable,
 but I notice that the Reply-to: header doesn't remotely match the From:
 header. Is this a common factor?
 

In the ones that I have seen, the reply-to doesn't match the from, and I
think the reply-to addresses have all been something.jp

 If it is, and the body texts have no common features that could also be
 used, the only obvious approach would be a rule for each forged sending
 domain that fires if the sending domain doesn't match the Reply-to
 domain. 
 

There isn't anything in common that I can see that wouldn't be
susceptible to false positives.  One even left the client's signature
intact.  I've written fairly simple custom rules before but I'm not sure
how to do conditional rules.  I'll have to dig into the docs a little
more.
 
 Only you can know if these rules would cause false positives: I can't
 possibly tell from a single sample message.
 

I wasn't expecting anybody to give me a magic rule that would fix it,
just suggestions since I would only be able to blacklist the sender
address after the e-mail had been received and I was notified of the
problem.  And obviously blacklisting all of gmail/hotmail/yahoo isn't an
option.

Thanks,

--Dennis



Re: Bogus Dollar Amounts

2010-02-25 Thread Dennis B. Hopp

Quoting Kai Schaetzl mailli...@conactive.com:


Dennis B. Hopp wrote on Wed, 24 Feb 2010 09:14:58 -0600:

Obviously I have something going on with my bayes, but that's a
separate issue


Indeed. But it's an important issue. If it is that biased for other
spam as well you are better off to not use it in this state.

X-Spam-Status: No, score=2.8 required=5.0 tests=BAYES_50,HK_MUCHMONEY,
T_LOTS_OF_MONEY,UNPARSEABLE_RELAY autolearn=no version=3.3.0

add your RBL score and it's way over 5.



I agree it's an important issue.  I had turned off bayes autoexpire in
local.cf and at some point taken the cron job out that did a manual
force-expire.  Once I did a force expire BAYES_60 triggered rather
than BAYES_00.


What is the HK_MUCHMONEY rule that you have?  Is that part of the base  
SA installation?


Thanks,

--Dennis


Bogus Dollar Amounts

2010-02-24 Thread Dennis B. Hopp
I have been seeing a few spam mails slip past that talk about being  
able to get bogus dollar amounts.  What I mean by that is it will give  
a large value in the e-mail but where there should be a comma it puts  
a period.
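
To show the kind of amount meant here, below is a rough Python sketch of a pattern that would match it. The regex and the function name are hypothetical and the sample amounts in the test are invented rather than taken from the pastebin message. Bear in mind that a period is a normal digit-group separator in many locales, so a real rule built this way could hit legitimate mail.

```python
import re

# Hypothetical pattern: "$" followed by digit groups separated by periods,
# e.g. $2.500.000, requiring at least two groups of exactly three digits.
DOTTED_DOLLARS = re.compile(r"\$\s?\d{1,3}(?:\.\d{3}){2,}(?!\d)")


def has_dotted_dollar_amount(text):
    """True if the text contains a dollar amount grouped with periods."""
    return DOTTED_DOLLARS.search(text) is not None


print(has_dotted_dollar_amount("You have been awarded $2.500.000 today"))
print(has_dotted_dollar_amount("Your invoice total is $2,500,000.00"))
```

Requiring two or more groups keeps ordinary prices such as $19.99 from matching.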


I put an example of one of these messages at:

http://pastebin.com/SXuGELUS

Are there any rules that can detect this?  The only rules this hit on  
mine are:


1.900   DCC_CHECK
1.449   RCVD_IN_BRBL_LASTEXT
1.000   RCVD_IN_BRBL
-0.001  SPF_PASS
-0.010  T_RP_MATCHES_RCVD
-1.900  BAYES_00

Obviously I have something going on with my bayes, but that's a separate issue

Thanks,

--Dennis


Re: Bogus Dollar Amounts

2010-02-24 Thread Dennis B. Hopp

Nevermind...it was also hitting

T_LOTS_OF_MONEY

and once I expired old bayes tokens it no longer hit BAYES_00.  Now I
just have to figure out what's up with my bayes db.


--Dennis

Quoting Dennis B. Hopp dh...@coreps.com:


I have been seeing a few spam mails slip past that talk about being
able to get bogus dollar amounts.  What I mean by that is it will give
a large value in the e-mail but where there should be a comma it puts a
period.

I put an example of one of these messages at:

http://pastebin.com/SXuGELUS

Are there any rules that can detect this?  The only rules this hit on
mine are:

1.900   DCC_CHECK
1.449   RCVD_IN_BRBL_LASTEXT
1.000   RCVD_IN_BRBL
-0.001  SPF_PASS
-0.010  T_RP_MATCHES_RCVD
-1.900  BAYES_00

Obviously I have something going on with my bayes, but that's a   
separate issue


Thanks,

--Dennis





Re: Bogus Dollar Amounts

2010-02-24 Thread Dennis B. Hopp



It is common in many parts of the world to use a period instead of a
comma as a digit group separator, and vice-versa for the decimal
separator.

http://en.wikipedia.org/wiki/Thousands_separator#Digit_grouping



I knew it was common in other parts of the world, but for some reason  
was thinking that when referring to US Dollars it wouldn't be.  Now  
that I think about it I can understand why my original thought was  
wrong.


I guess it doesn't really matter since the message was actually  
hitting another rule (T_LOTS_OF_MONEY) that I somehow missed.


--Dennis



Re: mail slipping through

2009-08-19 Thread Dennis B. Hopp

Quoting Gary Smith gary.sm...@holdstead.com:

I've been having a pretty good hit rate on spam until recently
(about two weeks).  Two types of email have been coming through at a
good rate.  I'm receiving at least four per hour from the domains
included below.  I've also been training bayes with them as well, to
no avail.


Is it pretty much the same body, just different senders?



*...@chocolatebearbear .INFO
*...@biblegame .info
*...@clickbetterthere .info



If it's just the senders you could easily blacklist the domains; none
of these domains look all that legit.


Can you copy a message or two (with full headers) to pastebin so we  
can have a look?


--Dennis


Re: Cant Post Message

2009-07-31 Thread Dennis B. Hopp

Quoting twofers twof...@yahoo.com:

I have a post I have tried several times over the last week to post   
to this forum and it never seems to get posted. I don't understand   
why?

 
There is nothing exotic about it, just text, a question and email   
header info I pasted.

 
Any idea whats up?
 
Thanks,
 
Wes





Try putting the header on a site like www.pastebin.com and then put
the link in your e-mail rather than the actual header.


--Dennis


Re: Number of rules

2009-07-31 Thread Dennis B. Hopp

Quoting LuKreme krem...@kreme.com:


On Jul 30, 2009, at 18:12, Dennis B. Hopp dh...@coreps.com wrote:
Yeah I knew that.  I have a few negative scoring rules but not many
(outside of what might be in the misc rules sets I have).  What is
a good threshold for ham then?


5.0 is the score SA is designed for. It's a very good number in almost
all cases.


I meant the threshold for bayes auto learn to learn the message.  I'll  
try switching back to the default values.


Re: Number of rules

2009-07-31 Thread Dennis B. Hopp

Quoting RW rwmailli...@googlemail.com:


On Fri, 31 Jul 2009 03:55:48 +0200
Karsten Bräckelmann guent...@rudersport.de wrote:



The default of 0.1. It's a default for a reason.

But that *really* is not your problem. Your problem is with learning
spam, not learning even more ham. Just as you mentioned in your
original report. See my previous response for a solution. You want to
learn more spam.


What he actually wrote was that 3.7% of _all_messages_ were hitting
BAYES_00, and 1.7% were hitting BAYES_99.

If he actually meant what he wrote and doesn't have an extraordinary
spam/ham ratio, then he clearly has a problem with both spam and
ham.



I cleared my maia statistics a couple of days ago.  Since then
BAYES_00 has triggered 4510 times, BAYES_99 2366 times and BAYES_50
1568 (all the other BAYES_XX are less than 1000 times).  In those same
couple of days we have processed about 45,000 messages (this is the
number of messages that actually reached spamassassin and weren't
outright rejected).  So my initial percentages were way off (I was going
by maia mailguard's sa rule statistics).  So roughly 10% of mail is
hitting BAYES_00 and 5% is hitting BAYES_99.  It seems to me that
BAYES_99 should probably be triggered more often than BAYES_00.
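
For reference, the rough percentages come straight from those counts. A quick Python check, taking 45,000 as the approximate total (the variable names are just for illustration):

```python
hits = {"BAYES_00": 4510, "BAYES_99": 2366, "BAYES_50": 1568}
total_messages = 45000  # approximate number of messages processed

# Hit rate of each rule as a percentage of all processed messages.
rates = {rule: 100.0 * count / total_messages for rule, count in hits.items()}
for rule, pct in rates.items():
    print(f"{rule}: {pct:.1f}%")
```

These figures are only as good as the 45,000 total, so they shift if some of that mail never reaches SpamAssassin.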


If there is a better way to get sa statistics I'd be happy to know.

I know that the bayes success rate comes down to training, but like
every other administrator I can't possibly check every message for
accuracy and I was hoping to make the auto learn a little better.  I
thought maybe I just didn't have enough rules (both negative and
positive scoring) to trigger the auto learn often enough.


Thanks,

--Dennis



Re: Number of rules

2009-07-31 Thread Dennis B. Hopp

Quoting John Hardin jhar...@impsec.org:


On Fri, 31 Jul 2009, Dennis B. Hopp wrote:

I cleared my maia statistics a couple of days ago.  Since then
BAYES_00 has triggered 4510 times, BAYES_99 2366 times and BAYES_50
1568 (all the other BAYES_XX are less than 1000 times).


Do they all add up to about 45,000?



No they don't.  I see some messages that trigger no rules at all  
(Bayes or otherwise).  I thought that was odd since I thought a bayes  
rule should trigger pretty much all the time.


In those same couple of days we have processed about 45,000
messages (this is the number of messages that actually reached
spamassassin and weren't outright rejected).


If there is a better way to get sa statistics I'd be happy to know.


sa_stats.pl from the SARE website.

http://www.rulesemporium.com/programs/


I'll take a look.  Will this work with logs that are written by amavisd-new?

Thanks,

--Dennis



Re: Number of rules

2009-07-31 Thread Dennis B. Hopp

Quoting Karsten Bräckelmann guent...@rudersport.de:


On Fri, 2009-07-31 at 06:07 -0700, John Hardin wrote:

On Fri, 31 Jul 2009, Dennis B. Hopp wrote:

 I cleared my maia statistics a couple of days ago.  Since then BAYES_00 has
 triggered 4510 times, BAYES_99 2366 times and BAYES_50 1568 (all the other
 BAYES_XX are less than 1000 times).

Do they all add up to about 45,000?


Doh!  Good catch, John.

No, they cannot possibly.  Do the math. These 3 rules are less than 10k,
remaining 35k. Each less than 1k hits means we need another  35 rules.
However, there are merely 6 ones left.

  $ grep -c BAYES_ 50_scores.cf
  9

The stats are incorrect.  Well, unless the lion's share is processed with
Bayes disabled, or otherwise not processed by SA.


I do have sanesecurity rules in clamav which may be filtering messages
before spamassassin sees them which would account for some of the
difference between the total BAYES triggered and messages received.
We also relay all outbound mail through these same servers but do not
send outbound mail through spamassassin which again would make for  
some difference.  I should have thought to mention that before.


I couldn't get sa-stats to give me any useful information.  I did get
amavis-logwatch and I am not sure if I like what it's showing me.  I ran it
against the last few maillogs I have so it encompasses basically the
last month.  Here are the relevant parts of the output:


http://pastebin.com/m59ddaf1d

If I'm reading that correctly less than 50% of mail is actually
being filtered (seems like it should be higher than that). Those stats
don't count the messages we completely reject.  We don't reject solely
on one RBL but use policy-weightd to reject messages.  I guess I could
just let all messages through to SA for a few days to see how things
change, but I don't see the point of wasting CPU/Memory for messages
that are pretty much guaranteed spam.


Here are the stats on my postfix:

http://pastebin.com/m15d2533e

Maybe I'm worried about nothing but given some of the spam that I get  
forwarded that gets through (some very obvious spam) and then to see  
what rules it hits just makes me think that something isn't quite right.


--Dennis




Re: Number of rules

2009-07-31 Thread Dennis B. Hopp

Quoting Karsten Bräckelmann guent...@rudersport.de:


If I'm reading that correctly less than 50% of mail is actually
being filtered (seems like it should be higher than that). Those stats


Actually, the numbers you gave for the last couple days are even
lower. About one third, 15k out of 45k do have a BAYES_xx hit and thus
are scanned by SA.

I told you how to train your Bayes, if you're not satisfied with the
result. Whether you like it or not, there really isn't another way. FWIW,
blocking the obvious offenders early seems like a proper explanation for
Bayes not showing a lot of high hitters.


Yes you did and I'm going to set something up to make a copy of the  
messages that trigger BAYES_20 through BAYES_80 into a separate  
mailbox that I can then inspect periodically for a while (while still  
letting the message be delivered to the user)
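
The selection step could look something like the Python sketch below. It only decides whether a message belongs in the review mailbox, based on the rule names in its X-Spam-Status header; actually copying the message is left out. The function name and the low/high parameters are illustrative, not part of any existing tool.

```python
import re

# Two-digit Bayes rules such as BAYES_00, BAYES_50 or BAYES_99.
BAYES_RULE = re.compile(r"\bBAYES_(\d{2})\b")


def needs_review(spam_status, low=20, high=80):
    """True if the header lists a BAYES_NN rule with low <= NN <= high."""
    match = BAYES_RULE.search(spam_status)
    if match is None:
        return False
    return low <= int(match.group(1)) <= high


status = ("No, score=2.8 required=5.0 tests=BAYES_50,HK_MUCHMONEY,"
          "T_LOTS_OF_MONEY,UNPARSEABLE_RELAY autolearn=no version=3.3.0")
print(needs_review(status))
```

Messages with no Bayes hit at all are skipped, so they would need a separate check if they should be reviewed too.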




Anyway, considering the back and forth -- IMHO, you *first* should get a
clear picture how exactly your mail is being processed. I don't feel
like stabbing in the dark.



And I don't expect you to take a stab in the dark.  The 45K messages
was the total processed inbound and outbound; I didn't think about the
fact that outbound is not funneled through SA and so would not be
seen in BAYES.  So I admit, it was a poor analysis on my part.





Maybe I'm worried about nothing but given some of the spam that I get
forwarded that gets through (some very obvious spam) and then to see
what rules it hits just makes me think that something isn't quite right.


Forwarded -- as in reports by your users, or forwarded from external MXs
to yours? In the latter case, the obvious thing to check is your
internal and trusted network settings.



Forwarded from internal users asking how it got through the spam  
filters.  I rarely get reports to our abuse/postmaster addresses (with  
the exception of AOL users who mark messages as spam when they clearly  
are not spam).


Number of rules

2009-07-30 Thread Dennis B. Hopp
I'm using maia-mailguard with spamassassin 3.2.5.  For the most part  
it seems to be working ok but I feel like too many messages are  
hitting BAYES_00 (roughly 3.7% of all messages) and BAYES_99 is only  
hitting about 1.7%.  I have bayes autolearn on with ham being learned  
at -1.0 and spam learned at 8.0


I'm sort of thinking part of my problem is I just don't have enough  
rules so I'm curious how many rules do other users out there have in  
their spamassassin setup?


I currently have about 2558 rules consisting of stock rules, SOUGHT,
KHOP, SARE, some custom rules I wrote and various rules I've seen
posted on this list and other sites.  I have a few plugins enabled as
well (FreeMail, iXhash, Botnet, ASN, Pyzor, Razor2, DCC)


I know some of it is just training of the bayes but I'm wondering if  
just lack of rules might be causing some of my problems.


Thanks,

--Dennis



Re: Number of rules

2009-07-30 Thread Dennis B. Hopp

Quoting RW rwmailli...@googlemail.com:



Bear in mind that autolearning uses its own version of the score that
excludes whitelisting and Bayes, which means that very little ham will
reach the -1 threshold unless you've added your own site-specific rules
for identifying it.



Yeah I knew that.  I have a few negative scoring rules but not many  
(outside of what might be in the misc rules sets I have).  What is a  
good threshold for ham then?


--Dennis