Re: Describing AWL

2011-03-07 Thread Dennis German

On 3/7/11 4:13 PM, John Hardin wrote:
On Mon, 7 Mar 2011, Adam Katz wrote:
On 03/06/2011 11:33 AM, Karsten Bräckelmann wrote:
On Sun, 2011-03-06 at 10:51 -0800, JP Kelly wrote:
 I just found an incoming message which is ham but marked as spam.
 It received a score of 14 because it is in the auto white-list.
 Shouldn't it receive a negative score?
 http://wiki.apache.org/spamassassin/AwlWrongWay

 Despite its name, the AWL is a score averager, based on the sender's
 history (limited by net-block).
 I encountered that misconception so much that I altered its description
 in my local.cf:
 describe AWL Adjust score towards average for this sender
 As a reminder, SVN trunk uses:
 describe AWL From: address is in the auto white-list
 Even if we don't change what AWL means, we don't need to spell it out
 as often. Cleaning up the docs would certainly be useful, but simply
 changing the description would cover most of the ground for us.
 Open a boog for it.

I prefer to call AWL HEAT (Heuristic Email Address Tracking).

You might be interested in my version of a utility sa-heatu documented at

http://www.real-world-systems.com/mail/sa-heatu.html

I have tried to clarify how HEAT works at
http://www.real-world-systems.com/mail/sa-heatu.html#backgrnd

which adds aging so as to lose old entries otherwise kept forever.

I also have some thoughts about discarding hammers at the end of that 
document.


Any feedback on this would be welcome.
Dennis German
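
The score-averaging behaviour discussed in this thread can be sketched in a few lines. This is a minimal illustration, not SpamAssassin's actual code: the function name and the sample numbers are made up, and the factor of 0.5 is assumed to match the default auto_whitelist_factor.

```python
def awl_adjust(current_score, total_score, count, factor=0.5):
    """Return (awl_delta, final_score) for a message from a known sender.

    total_score and count stand for the sender's stored history:
    the sum of earlier scores and how many messages were seen.
    """
    if count == 0:
        # No history yet, so there is nothing to average towards.
        return 0.0, current_score
    mean = total_score / count
    # Pull the current score part of the way towards the historical mean.
    delta = (mean - current_score) * factor
    return delta, current_score + delta


# A sender whose three earlier messages averaged 20 points sends a
# message that would otherwise score 2.0.
delta, final = awl_adjust(current_score=2.0, total_score=60.0, count=3)
print(f"AWL={delta:+.1f}, final score={final:.1f}")
```

With a spammy history the adjustment is positive, which is how a ham message can end up with a high score "because it is in the auto white-list".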



Re: Supporting 3.3 and 3.2?

2011-03-04 Thread Dennis German

On 3/3/11 10:09 PM, Karsten Bräckelmann wrote:

On Fri, 2011-03-04 at 03:36 +0100, Karsten Bräckelmann wrote:

On Thu, 2011-03-03 at 15:52 -1000, Warren Togami Jr. wrote:

Could we please make an official project statement that 3.2.x is
unsupported and people should really update to 3.3.x?

That said, personally, with various Open Source projects, I have never
given up support for old versions. As long as I *can* help people, I
will.
Besides, in this particular case, the *real* underlying issue of a badly
trained Bayes won't get fixed by updating. Yes, the overall score would
change drastically, as shown, but the training has been rather poor and
won't change overnight by updating.

I would surely use a more recent version of SA if I could.
My hosting service uses cPanel and CentOS and I cannot convince them to
upgrade.


Re: low score for ($1.5Million)

2011-03-04 Thread Dennis German

On 3/3/11 8:06 PM, Karsten Bräckelmann wrote:

On Fri, 2011-03-04 at 01:53 +0100, Mikael Syska wrote:

I get the following hits:
Content analysis details:   (19.1 points, 5.0 required)
Note though, that your score is on SA 3.3.x, while the OP uses SA 3.2.x.
Yes, I can tell this from the scores. :)

Major changes between these version are clearly reflected in your score
and rules hit. Namely a lot of work by John Hardin to catch exactly such
fraud, and the FreeMail plugin now upstream -- with 3.2 it is available
as a third-party plugin.

  0.8 BAYES_50   BODY: Bayes spam probability is 40 to 60%
X-Spam-testscores: AWL=1.086,BAYES_00=-2.599,HTML_MESSAGE=0.001,
MILLION_USD=1.528


"while the OP uses": what does OP mean?
Please direct me to info on the FreeMail plugin.
Is it expected that I will be able to implement it given I am on a
shared host without root access?


Karsten,
 Thank you for your continued help. We all really appreciate your efforts.




low score for ($1.5Million)

2011-03-03 Thread Dennis German

Can someone comment on the low score assigned to the email located at

http://www.cccu.us/hundredThousand.txt

X-Spam-testscores: AWL=1.086,BAYES_00=-2.599,HTML_MESSAGE=0.001,
MILLION_USD=1.528

Is my bayes broken?
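
A quick way to see what is going on in that header is to add the per-rule scores up. The sketch below assumes only that the testscores header is a comma-separated list of RULE=score pairs, as shown above; the function name is made up for illustration.

```python
def parse_testscores(header_value):
    """Turn 'RULE=score,RULE=score,...' into a dict of floats."""
    scores = {}
    # The header may be folded across lines, so drop newlines first.
    for item in header_value.replace("\n", "").split(","):
        item = item.strip()
        if item:
            name, _, value = item.partition("=")
            scores[name] = float(value)
    return scores


header = "AWL=1.086,BAYES_00=-2.599,HTML_MESSAGE=0.001,\nMILLION_USD=1.528"
scores = parse_testscores(header)
total = round(sum(scores.values()), 3)
lowest = min(scores, key=scores.get)
print(total, lowest)
```

The positive rules add up to 2.615 points and BAYES_00 takes 2.599 of them away again, leaving a total of roughly 0.016.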


Re: Collecting IP reputation data from many people

2010-10-25 Thread Dennis German
On Oct 23, 2010, at 12:31 PM, Royce Williams wrote:

 On Sat, Oct 23, 2010 at 7:31 AM, Per Jessen p...@computer.org wrote:
 Royce Williams wrote:
 
 On Fri, Oct 22, 2010 at 5:19 AM, Michael Scheidell
 michael.scheid...@secnap.com wrote:
 On 10/21/10 8:50 PM, dar...@chaosreigns.com wrote:
 
 I'd like to try collecting reputation data for every IP address from
 everyone willing to submit it.
 
 re-inventing the wheel.
 
 If what's being suggested is a non-commercial alternative to a
 commercial product, then I think that the pejorative connotations of
 re-inventing the wheel don't apply. :-)  This is a wheel that needs
 re-inventing, and begs for an RFC.
 
 http://www.roaringpenguin.com/draft-dskoll-reputation-reporting-01.txt
 
 As a fan of MIMEDefang, I should have remembered this. It looks like a
 great start.
 
 Have any products other than MIMEDefang (and its Can-IT commercial
 arm) implemented this?
 
 Royce

Have you pulled your own data from auto-whitelist ?


rule for To: undisclosed-recipients:;

2010-10-24 Thread Dennis German
Is there, or should there be, a rule for a header like:
To: undisclosed-recipients:;


Spam US$350,000 not tripped

2010-10-19 Thread Dennis German
I am surprised this plain text spam did not trip for US$350,000
sa 3.2.4

http://www.Real-World-Systems.com/mail/spam.un


Re: Spam US$350,000 not tripped

2010-10-19 Thread Dennis German
On Oct 19, 2010, at 5:56 PM, Karsten Bräckelmann wrote:

 On Tue, 2010-10-19 at 22:41 +0100, Ned Slider wrote:
 On 19/10/10 22:34, Dennis German wrote:
 I am surprised this plain text spam did not trip for US$350,000
 sa 3.2.4
 
 Uhm, a generic amount of money on its own is not a sign of spam. You
 know, some people do deal with and talk about money...
 
 It hits a stack of rules here (some are my own scoring) - looks like 
 it's time to upgrade to SA 3.3.1.
 
 *  6.0 BAYES_99 BODY: Bayes spam probability is 99 to 100%
 *  [score: 0.]
 *   25 RCVD_IN_BRBL_LASTEXT RBL: RCVD_IN_BRBL_LASTEXT
 *  [148.208.170.3 listed in bb.barracudacentral.org]
 
 Seriously? Or is that a score typo in your cf files?
 
 *  3.0 RCVD_IN_JMF_BL RBL: Relay listed in JunkEmailFilter BLACK 
 (bad)
 *  [148.208.170.3 listed in hostkarma.junkemailfilter.com]
 
 BRBL and JMF are easy enough to add to an existing 3.2.x installation.
 
 *  1.0 MISSING_HEADERS Missing To: header
 
 Stock 3.2.x, scored even slightly higher.
 
 *  3.0 JM_SOUGHT_FRAUD_3 Body contains frequently-spammed text 
 patterns
 
 Easy enough to add to 3.2.x via sa-update. Recommended.
 
 Bayes of course also is part of stock 3.2.x. ;)  Plethora of new fraud
 rules snipped.

Karsten,
Thank you for the suggestion of adding BRBL and JMF.
Can you please point me to some detailed information explaining how to do that?
PS: I am on a shared server without root access (or I would have upgraded SA).

spamc sometimes complains MISSING_MID sometimes not with same message

2010-10-09 Thread Dennis German
The question is: Has anyone seen unpredictable and different results when 
processing the same message?

The operative part of the script is:

#first run. use system wide values
echo "setting aside user_prefs, running with system wide values"
mv ~/.spamassassin/user_prefs  ~/.spamassassin/user_prefss
cp ~/.spamassassin/user_prefs.rptonly  ~/.spamassassin/user_prefs
grep -iv X-SPAM "$1" | spamc > "$1.o"
grep X-Spam "$1.o"
grep -A14 'pts rule name' "$1.oo" | grep -v '\-\-\-\-'

#second run. use all MY prefs
mv -f ~/.spamassassin/user_prefss ~/.spamassassin/user_prefs
grep -iv X-SPAM "$1" | spamc > "$1.oo"
grep X-Spam "$1.oo"
grep -A13 'pts rule name' "$1.oo" | grep -v '\-\-\-\-'



where user_prefs.rptonly  contains
add_header all report _REPORT_
add_header all testscores _TESTSSCORES(,)_

I run the script multiple times and get unpredictable results regarding the 
appearance of MISSING_MID.



Thank you,
Dennis German

Hello world, goodnight moon

Re: spamc sometimes complains MISSING_MID ... NOT...

2010-10-09 Thread Dennis German
There is at least one problem with my script, NOT spamassassin.
I did not expect the results to be in different order.
The grep -A14 'pts rule name' may not display all the errors.

Sorry 'bout that.
Dennis
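
Since the order of the report lines is not fixed, comparing two runs is more reliable when the rule lines are collected into a set instead of cut off with grep -A. A minimal sketch follows; the two sample reports are made up for illustration and the regular expression assumes report lines of the form "score RULE_NAME description".

```python
import re

# Matches report lines such as " 3.7 MISSING_MID  Missing Message-Id: header"
RULE_LINE = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s+([A-Z][A-Z0-9_]*)\b")


def rules_hit(report_text):
    """Return {rule_name: score} for every rule line, whatever the order."""
    hits = {}
    for line in report_text.splitlines():
        m = RULE_LINE.match(line)
        if m:
            hits[m.group(2)] = float(m.group(1))
    return hits


first = """ pts rule name              description
---- ---------------------- ------------------
 0.0 MISSING_MID            Missing Message-Id: header
 0.0 HTML_MESSAGE           BODY: HTML included in message
"""
second = """ pts rule name              description
---- ---------------------- ------------------
 0.0 HTML_MESSAGE           BODY: HTML included in message
 3.7 MISSING_MID            Missing Message-Id: header
"""

a, b = rules_hit(first), rules_hit(second)
# Rules present in one run but not the other; empty when both runs agree.
print(sorted(set(a) ^ set(b)))
```

Here both runs hit the same rules even though MISSING_MID moved from the first line to the last, which a fixed-length grep window could hide.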



spamc sometimes complains MISSING_MID sometimes not with same message

2010-10-08 Thread Dennis German
First an overview:
spamassassin 3.2.5; shared host ISP won't update spamassassin, setup is such 
that  SCORE keyword in user_prefs is ignored.
ISP will neither include  add_header all report _REPORT_   nor
add_header all testscores _TESTSSCORES(,)_
++
I have a script to 
set ~/.spamassassin/user_prefs to contain only:
 add_header all report _REPORT_
 add_header all testscores _TESTSSCORES(,)_
take spam I received and run spamc

then set ~/.spamassassin/user_prefs to contain a large number of SCORE entries I 
would have liked spamassassin to use,
including :
score MISSING_MID   3.7
run spamc again just to see what would have happened with my SCOREs.

This all works very nicely, usually.
++
Today I ran a particular message and the first run included:
 0.0 MISSING_MID Missing Message-Id: header
in the report.
The second run did not mention MISSING_MID.

I reran the script and this time the first run did not mention MISSING_MID in 
the report but
the second run included
 3.7 MISSING_MID Missing Message-Id: header
in the report.

I have added various greps to the script referencing the message as well as 
user_prefs and
run the script with unpredictable results, that is any given run may or may not 
show MISSING_MID.
I was surprised to find one run where the
0.0 MISSING_MID Missing Message-Id: header
in the report was the last score message, as it usually occurs after
complaints of BLs and before HTML issues.

Has anyone seen this behavior?

Thank you,
Dennis German

Hello world, goodnight moon

Re: Expiring Bayes; aka bayes files are BIG

2010-09-15 Thread Dennis German
On Aug 26, 2010, at 10:11 AM, Grant Peel wrote:
...
 ~/.spamassassin/bayes* files had grown to 1.5 GB
 I have put:
 use_bayes 0
 bayes_auto_learn 0
 bayes_auto_expire   1
 bayes_expiry_max_db_size 5
 in the local.cf file, and restarted spamd.
 
 The database did not appear to trim, so I tried:   sa-learn -u user -D 
 --force-expire
 and the database is still 1.5 GB.
 I know I am doing something(s) incorrect, but can't figure out what.
 How do I properly trim the offending file(s)?
 Is there a command to trim all databases (users) on the box?
 Any advice would be appreciated.   Spamassassin 3.2.5,  FreeBSD 8.0
 -Grant 
 
I believe that  bayes_seen is a perl hash and will not be reduced in size by 
deleting entries.
The only way to reduce its size is to have a program read the current file, 
entry by entry and
output to a new file. This will not copy deleted entries and the output will be 
significantly smaller.
I don't know of any program, but if there is interest I might write one.
Dennis German
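
The "read entry by entry and write a new file" idea can be sketched as follows. This is only an illustration of the approach: it uses Python's dbm.dumb module as a stand-in, the file names and keys are made up, and it will not read a real bayes_seen file, which is stored in a different database format.

```python
import dbm.dumb
import os
import tempfile


def copy_live_entries(src_path, dst_path):
    """Copy every remaining key/value pair into a freshly created database."""
    with dbm.dumb.open(src_path, "r") as src, dbm.dumb.open(dst_path, "n") as dst:
        for key in src.keys():
            dst[key] = src[key]


workdir = tempfile.mkdtemp()
old_path = os.path.join(workdir, "seen_old")
new_path = os.path.join(workdir, "seen_new")

# Build a database, then delete most of its entries.
with dbm.dumb.open(old_path, "n") as db:
    for i in range(1000):
        db[f"msgid-{i}"] = "s"
    for i in range(900):
        del db[f"msgid-{i}"]

# Only the entries that survived the deletes reach the new file.
copy_live_entries(old_path, new_path)
with dbm.dumb.open(new_path, "r") as db:
    remaining = len(db)
print(remaining)
```

The new database holds only the live entries, so whatever space the deleted ones still occupy in the old file is not carried over.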



Re: Expiring Bayes; aka bayes files stay BIG

2010-09-15 Thread Dennis German
On Sep 15, 2010, at 1:42 PM, RW wrote:

 On Wed, 15 Sep 2010 11:18:20 -0400
 Dennis German dger...@real-world-systems.com wrote:
 
 On Aug 26, 2010, at 10:11 AM, Grant Peel wrote:
 ...
 ~/.spamassassin/bayes* files had grown to 1.5 GB
 I have put:
 use_bayes 0
 bayes_auto_learn 0
 bayes_auto_expire   1
 bayes_expiry_max_db_size 5
 in the local.cf file, and restarted spamd.
 
 The database did not appear to trim, so I tried:   sa-learn -u
 user -D --force-expire and the database is still 1.5 GB.
 I know I am doing something(s) incorrect, but can't figure out what.
 How do I properly trim the offending file(s)?
 Is there a command to trim all databases (users) on the box?
 Any advice would be appreciated.   Spamassassin 3.2.5,  FreeBSD 8.0
 -Grant 
 
 I believe that  bayes_seen is a perl hash and will not be reduced in
 size by deleting entries. The only way to reduce its size is to have
 a program read the current file, entry by entry and output to a new
 file. This will not copy deleted entries and the output will be
 significantly smaller. ...
  Dennis German
 
 It's straightforward to do it with backup and restore, but the problem
 is that there is no time field. You might just as well delete
 the file periodically.  

Thanks for the info; however, after running backup & restore:
Before:
41,619,456 Sep 15 19:04 bayes_seen
2,543,616 Sep 15 19:04 bayes_toks 
After:
43,511,808 Sep 15 19:26 bayes_seen
 2,560,000 Sep 15 19:26 bayes_toks



spam caught, now how to catch spammer

2010-09-05 Thread Dennis German
In the last several weeks I have been receiving a lot of spam with email 
addresses of the form:

learningmadeeasy.???...@??.yourseemlost.net
learningmadeeasy.???...@??.hisoftenusing.net
learningmadeeasy.???...@??.wheatdrinkcontrol.net
learningmadeeasy....@??.actbookfelt.net
learningmadeeasy....@??.stillstationwhether.net
learningmadeeasy....@??.legbottleloss.net

and 
accountingeducation.gpx...@oiteew.badpeoplepaper.net 
 accountingeducation.ihd...@aapufx.stillstationwhether
 accountingeducation.ionm...@wxnuab.legbottleloss.net 
 accountingeducation.iqle...@mlmuwx.stillstationwhethe

and 

affordablelifeinsurance.aj...@wiogif.constum.net 
affordablelifeinsurance.ki...@pzodkk.injecou.net 

How do we stop this guy?



AWL demoted??

2010-08-10 Thread Dennis German
On Jul 22, 2010, at 10:47 AM, Michael Scheidell wrote:...
due to performance vs accuracy issues, AWL was demoted in SA 3.3x.

Can you please define "demoted"?

My ISP MidPhase.com,  part of uk2group.com,  uses cpanel.net (used by many 
ISPs) 
which seems to be stuck on SpamAssassin 3.2.4 (2008-01-01)

I requested that they upgrade last year and they weren't interested.
I requested this again last week and they are still evaluating it.

Thank you,

Dennis German

Re: Auto Learn Spam

2010-04-28 Thread Dennis B. Hopp

On Wed, 2010-04-28 at 11:53 -0400, Carlos Mennens wrote:
 I noticed when reviewing headers today that there was a section for
 'autolearn=no' and was wondering what exactly does this mean and
 wouldn't autolearn be a good thing? I use Amavisd-new which calls out
 to SpamAssassin modules but I don't have the spamd daemon running
 physically. The Amavisd-new daemon simply loads the modules for spamd
 and does the scoring directly, saving my mail server from running more
 daemons and system resources than it needs to. So below are the
 headers:
 

Autolearn kicks in at certain scores.  I believe the default is 12.0 for
spam and 0.1 for ham.  You can customize those settings in your local.cf
file.

bayes_auto_learn 1
bayes_auto_learn_threshold_nonspam -3.0
bayes_auto_learn_threshold_spam 12.0

I changed the default value for nonspam because the majority of my users
don't train bayes and so the default value could cause bayes to learn
incorrectly if a spam message scored low (maybe no network rules or URI
rules triggered the first few times).

 X-Spam-Status: No, score=2.808 tagged_above=-999 required=5
 tests=[BAYES_50=0.8, HTML_IMAGE_ONLY_24=1.618, HTML_MESSAGE=0.001,
 HTML_MIME_NO_HTML_TAG=0.377, MIME_HTML_ONLY=0.723,
 RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01]
 autolearn=no
 

This particular message scored a 2.808 so it's not high or low enough
for bayes to know which way it should learn the message.

--Dennis
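
The threshold logic described above can be sketched like this. It is a simplified illustration with a made-up function name: as far as I know, the real autolearn decision works from a separately computed score and applies extra conditions before learning as spam, none of which is modelled here.

```python
def autolearn_decision(score, nonspam_threshold=0.1, spam_threshold=12.0):
    """Return 'spam', 'ham' or 'no' for a given message score."""
    if score >= spam_threshold:
        return "spam"
    if score < nonspam_threshold:
        return "ham"
    # Between the two thresholds nothing is learned.
    return "no"


# The message from the headers above, with the default thresholds.
print(autolearn_decision(2.808))
# A low-scoring message, with the default and with the stricter -3.0 threshold.
print(autolearn_decision(-1.0))
print(autolearn_decision(-1.0, nonspam_threshold=-3.0))
```

Lowering the nonspam threshold to -3.0 widens the band in which nothing is learned, so a spam that happens to score just under zero is no longer learned as ham.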



Re: Auto Learn Spam

2010-04-28 Thread Dennis B. Hopp

On Wed, 2010-04-28 at 12:38 -0400, Carlos Mennens wrote:

 I checked /etc/mail/spamassassin/local.cf just now and found only the 
 following:
 
 required_hits 5
 report_safe 0
 rewrite_header Subject [SPAM]
 
 However I don't know if Amavisd-new is looking at local.cf because I
 show parameters in my amavisd.conf file for SpamAssassin:
 
 $sa_tag_level_deflt  = -999.0;  # add spam info headers if at, or
 above that level
 $sa_tag2_level_deflt = 5.0; # add 'spam detected' headers at that level
 $sa_kill_level_deflt = 8.0; # triggers spam evasive actions (e.g.
 blocks mail)
 $sa_dsn_cutoff_level = 10;  # spam level beyond which a DSN is not sent
 $sa_quarantine_cutoff_level = 12; # spam level beyond which quarantine is off
 $penpals_bonus_score = 8;# (no effect without a @storage_sql_dsn database)
 $penpals_threshold_high = $sa_kill_level_deflt;  # don't waste time on hi spam
 

These settings are for amavisd-new and not spamassassin.  Amavisd-new is
the glue between your MTA and spamassassin (and virus scanners).  Most
of the behavior of spamassassin is still controlled through the local.cf
(although some settings can be defined in both places and the
amavisd.conf file will take precedence).

 $sa_mail_body_size_limit = 400*1024; # don't waste time on SA if mail is 
 larger
 $sa_local_tests_only = 0;# only tests which do not require internet 
 access?
 [...]
 $sa_spam_subject_tag = '***SPAM*** ';
 $defang_virus  = 1;  # MIME-wrap passed infected mail
 $defang_banned = 1;  # MIME-wrap passed mail containing banned name
 # for defanging bad headers only turn on certain minor contents categories:
 $defang_by_ccat{+CC_BADH.,3} = 1;  # NUL or CR character in header
 $defang_by_ccat{+CC_BADH.,5} = 1;  # header line longer than 998 characters
 
 When I get a spam message that was scored by SA, it says ***SPAM***
 and not [SPAM] so that leads me to believe that SA parameters are
 being fed from amavisd.conf file. Does this make sense to you guys?

This is just the setting in amavisd.conf taking precedence.  If you were
to comment out $sa_spam_subject_tag I *believe* the value in your
local.cf would then be used.




Re: multiple instances

2010-04-16 Thread Dennis B. Hopp

On Fri, 2010-04-16 at 10:08 -0700, Gary Smith wrote:
 I have a need to run several different instances of SA on a single box (in 
 development).  In production, we have 3  different SA environments (with 2+ 
 servers each) that have different rule sets and specific routing rules 
 determine which instance it gets sent to.   We need to mimic this in 
 development.  
 
 Ideally I would like to create all 3 instances (*2 mimicking load balancing) 
 on a single development box.  We're not worried about the performance or 
 memory aspect.
 
 Is this possible, and if so, is there an easy way to do this.   I was 
 thinking that I could create separate chroot environments for each one if 
 necessary and either bind each instance to an IP (which I'm not sure if 
 that's possible) or at least a different port.
 
 Any advice (or some sample scripts on doing this) would be greatly 
 appreciated.
 

I'm sure it's possible, but rather than going through all the work of
trying to script and setup chroot environments, why not use VMs?  You
can then quite literally match the production setup.

Since you are not worried about performance or memory you could give
each VM 128 MB of RAM and only be using 1 GB or so total...

--Dennis



Re: Quarantine Management

2010-04-10 Thread Dennis B. Hopp

Quoting Alex mysqlstud...@gmail.com:


Hi,

Just wondering what other tools are out there that people like.

I use postfix as my MTA right now, but am not completely opposed to using
something else if necessary to use a specific quarantine system.


Amavisd-new works well with postfix


Maia Mailguard uses amavisd-new, but an old version.



I think he's probably referring to something that would help him
manage the quarantine itself, such as to query it for FNs, provide
some type of reporting, forward FPs back to the proper recipient,
manage expiry, expunging, and scoring, etc?


Yes exactly what I'm referring to.  Wishlist would be:

User controllable (i.e users can release spam messages back into their  
mailbox)

Whitelist/blacklist management
Domain configurations

maia mailguard has pretty much all of that but hasn't been updated in  
a while, just looking for other possibilities.


Do people just flag the message as spam (maybe in the header) and then  
let users filter to a spam folder?  We are using this as a front end  
to Exchange, so I guess we could just flag it and then have Exchange  
deliver it to the user's Junk E-mail folder, but then bayes can't  
learn from its mistakes as easily.


--Dennis


AWL

2010-04-09 Thread Dennis B. Hopp
I have AWL enabled and it seems to be ok with helping out legitimate
senders that occasionally send a spammy type message, but lately I
have seen an increase where AWL is adding a negative score to a very
blatant spam.  

So my questions are, do people feel AWL is worth having enabled?  

Is there a way to have the AWL rule only triggered if there is a minimum
number of messages seen by that sender?

--Dennis



Re: AWL

2010-04-09 Thread Dennis B. Hopp

 Not that I'm aware of.
 
 Is the AWL score enough to prevent the messages from being marked as
 spam, or are you seeing the negative AWL score on messages that are
 marked as spam?  It is normal for AWL to give negative scores to spam
 from time to time, but for the most part, it should not be enough to
 push the score below the spam threshold.

Not usually, but I have seen a few messages that triggered BAYES_99 or
BAYES_95 and then a few other rules that pushed the score to just above
5.0 (which is what I block at) and then AWL will come in with say a
-0.35 and drop the overall score to 4.8.

I know how AWL works and occasionally it will lower the score of a spam,
but it just seems to be happening more often lately.  I store my AWL in
mysql so I just deleted all entries that have a count of less than 20.
I think pretty much every time this happens the AWL count is low (maybe
3 or 4). 

--Dennis
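
The clean-up described above amounts to a single DELETE statement. The sketch below runs it against an in-memory SQLite table as a stand-in for the MySQL database; the column layout is assumed to follow the usual SQL AWL table (username, email, ip, count, totscore) and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE awl (username TEXT, email TEXT, ip TEXT, "
    "count INTEGER, totscore REAL)"
)
rows = [
    ("filter", "regular@example.com", "192.0", 45, -30.5),
    ("filter", "newsender@example.net", "198.51", 3, -1.2),
    ("filter", "oneoff@example.org", "203.0", 1, 0.4),
]
conn.executemany("INSERT INTO awl VALUES (?, ?, ?, ?, ?)", rows)

# Drop every sender seen fewer than 20 times.
deleted = conn.execute("DELETE FROM awl WHERE count < ?", (20,)).rowcount
remaining = conn.execute("SELECT email, count FROM awl").fetchall()
print(deleted, remaining)
```

This only removes the history that already exists; senders with a low count can still build up new entries and receive an AWL adjustment afterwards.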



KHOP_RCVD_TRUST

2010-03-26 Thread Dennis B. Hopp
I received the following e-mail

http://pastebin.com/JXr9buxi

It had a total score of 4.973 (blocked at 5).  Among other rules it hit:

KHOP_RCVD_TRUST=-1.75,RCVD_IN_DNSWL_MED=-0.5,SPF_PASS=-0.001

So is the KHOP_RCVD_TRUST score too low?  Should I possibly consider
making that -0.75 or something?  Is there a way to report FP to KHOP?

Thanks,

--Dennis




Re: KHOP_RCVD_TRUST

2010-03-26 Thread Dennis B. Hopp

On Fri, 2010-03-26 at 11:35 -0400, Michael Scheidell wrote:
 
 On 3/26/10 10:41 AM, Dennis B. Hopp wrote:
  I received the following e-mail
 
  http://pastebin.com/JXr9buxi
 
  It had a total score of 4.973 (blocked at 5).  Among other rules it hit:
 
  KHOP_RCVD_TRUST=-1.75,RCVD_IN_DNSWL_MED=-0.5,SPF_PASS=-0.001
 
 
 is that an old rule? i just checked SA updates, and I don't see that 
 rule in current SA 3.3.1
 
 so, who is KHOP?  I looked in rule sets and don't know them.  were these 
 rules inherited from some outside trusted source?
 
 

http://khopesh.com/wiki/Anti-spam#sa-update_channels

Some of his rules I believe have been incorporated into mainline sa.
I'm using 3.3.1.  I just got an update from some of the KHOP channels
yesterday, so they appear to be maintained.

--Dennis



Re: Upgrading to SpamAssassin 3.3

2010-03-17 Thread Dennis B. Hopp

On Wed, 2010-03-17 at 11:35 -0400, Kaleb Hosie wrote:
 Hello,
 I'm running SA 3.2.5 on CentOS 5.4 and I've noticed that a newer major 
 release has been released. The server is currently in production so I'm a bit 
 leery to upgrade.
 
 Do you feel that it is worth the upgrade to 3.3? Is there anything I should 
 know before I go ahead and upgrade?
 

I upgraded CentOS 5.4 to 3.3.0 and only ran into one issue which had
nothing to do with spamassassin.  The upgrade of spamassassin went fine
but I use it with maia-mailguard and the current stable version of
maia-mailguard does not work correctly with 3.3.0.  There is a patch in
the svn for maia that fixes the issue.

--Dennis



Re: Bogus mails from hijacked accounts

2010-03-12 Thread Dennis B. Hopp

 describe FORGED_HOTMAIL   Hotmail with non-Hotmail Reply-to address
 header   __FORGED_HM1     From =~ /\@hotmail\.com/i
 header   __FORGED_HM2     Reply-to =~ /\@hotmail\.com/i
 meta     FORGED_HOTMAIL   (__FORGED_HM1 && !__FORGED_HM2)
 score    FORGED_HOTMAIL   5.0
 
 and write cookie cutter rules for Yahoo and Gmail. 
 
 OTOH if you're happy that a Japanese test won't generate FPs you can
 cover all three ISPs with one rule:  
 
 describe FORGED_FROM Hotmail, Yahoo or Google with Japanese Reply-to
 header   __FF1       From =~ /\@(hotmail|yahoo|gmail)\.com/i
 header   __FF2       Reply-to =~ /\.jp/i
 meta     FORGED_FROM (__FF1 && __FF2)
 score    FORGED_FROM 5.0
 
 Of course, if it's just a few Japanese ISPs being used you can easily
 make __FF2 more specific.
 

I tried this for yahoo...

describe FORGED_YAHOO Yahoo with non-Yahoo Reply-to address
header   __FORGED_YH1 From =~ /\@yahoo\.com/i
header   __FORGED_YH2 Reply-to =~ /\@yahoo\.com/i
meta     FORGED_YAHOO (__FORGED_YH1 && !__FORGED_YH2)
score    FORGED_YAHOO 0.25

And it triggered on a message with the following header

http://pastebin.com/qs18DpYn

My best guess is it is using the In-Reply-To header...is there a way
to differentiate In-Reply-To and Reply-To ?

Thanks,

--Dennis



Re: [sa] Re: Bogus mails from hijacked accounts

2010-03-12 Thread Dennis B. Hopp


 The problem with this is that the !__FORGED_YH2 matches
 when there is *NO* Reply-To header at all!
 
 You need something like this:
 
 header __FORGED_YH2 Reply-To =~ /\@([^y]|y[^a]|ya[^h]|yah[^o])/i
 meta FORGED_YAHOO (__FORGED_YH1 && __FORGED_YH2)
 
 (remove the negation from the meta)
 This directly tests for an existing Reply-To specifically to a domain
 that does not begin with 'yaho'.

Wouldn't that meta rule trigger when the reply-to contained 'yaho'?  I
want to trigger when the from contains yahoo.com and the reply-to does
not.

 
 However, keep in mind that the headers for *this* mailing list would 
 trigger your rule. So you will also need to meta this with a rule that 
 tests for yahoo mail server being the sending SMTP client
 

Good point.  I didn't think about that..

--Dennis



Re: [sa] Re: Bogus mails from hijacked accounts

2010-03-12 Thread Dennis B. Hopp

On Fri, 2010-03-12 at 12:52 -0600, Dennis B. Hopp wrote:
 
  The problem with this is that the !__FORGED_YH2 matches
  when there is *NO* Reply-To header at all!
  
  You need something like this:
  
  header __FORGED_YH2 Reply-To =~ /\@([^y]|y[^a]|ya[^h]|yah[^o])/i
  meta FORGED_YAHOO (__FORGED_YH1 && __FORGED_YH2)
  
  (remove the negation from the meta)
  This directly tests for an existing Reply-To specifically to a domain
  that does not begin with 'yaho'.
 
 Wouldn't that meta rule trigger when the reply-to contained 'yaho'?  I
 want to trigger when the from contains yahoo.com and the reply-to does
 not.

Never mind... the '^' inside brackets negates... I get it now.
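
The behaviour of the suggested pattern is easy to check outside SpamAssassin. The sketch below applies the same regular expression in Python; the function name and the sample addresses are made up, and a missing Reply-To header is represented by None.

```python
import re

FROM_YAHOO = re.compile(r"@yahoo\.com", re.IGNORECASE)
# Matches an '@' followed by anything that does not start with 'yaho'.
REPLY_NOT_YAHO = re.compile(r"@([^y]|y[^a]|ya[^h]|yah[^o])", re.IGNORECASE)


def forged_yahoo(from_header, reply_to_header):
    """True when From is at yahoo.com and Reply-To exists but points elsewhere."""
    if reply_to_header is None:
        # No Reply-To header at all: the positive test cannot match.
        return False
    return bool(
        FROM_YAHOO.search(from_header)
        and REPLY_NOT_YAHO.search(reply_to_header)
    )


for reply_to in ("alice@yahoo.com", "alice@example.jp", "alice@ymail.com", None):
    print(reply_to, forged_yahoo("alice@yahoo.com", reply_to))
```

Each alternative inside the parentheses rejects one more leading character of "yaho", so the pattern only fails to match when the domain really does begin with those four letters.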



Re: My First Spam Mail Today

2010-03-12 Thread Dennis B. Hopp

 My headers look like:
 
 X-Spam-Checker-Version: SpamAssassin 3.3.0 (2010-01-18) on mail.iamghost.com
 X-Spam-Level: *
 X-Spam-Status: No, score=1.0 required=6.3
  tests=EXTRA_MPART_TYPE,HTML_MESSAGE autolearn=no version=3.3.0
 
 *
 

The message scored a 1.0 (score=1.0) but the X-Spam-Score header
apparently wasn't added to the message.

 The above snippet shows no score as I would expect to see below from a
 different server:
 
 X-Spam-Flag: NO
 X-Spam-Score: -1.15
 X-Spam-Level:
 X-Spam-Status: No, score=-1.15 tagged_above=-999 required=5
 tests=[BAYES_00=-2.599, MSGID_MULTIPLE_AT=1.449] autolearn=no
 
 *
 
 Am I missing something in my local.cf that is not properly scoring all
 incoming messages?

In this example you also have tagged_above=-999 which leads me to
believe you are using amavisd-new.  Are both servers using
amavisd-new?  

--Dennis



Re: Bogus mails from hijacked accounts

2010-03-11 Thread Dennis B. Hopp

 1)  Spammers rotate sender addresses and hijacked account info more 
 often than most of us change our underwear.  An account *may* get 
 reused;  chances are it'll be months before it does, and the spammers 
 will have rotated through hundreds or thousands of others - both 
 phish-cracked and those set up just to send their junk.  Blacklisting a 
 sender is reduced to blocking the persistent friend-of-a-friend who 
 refuses to remove you from the endless stream of chain-forwards, and 
 legitimate-but-totally-clueless mailing list operators who can't figure 
 out how to unsubscribe you from their list.  :(
 
 2)  You noted originally that these appear to be fully legitimate 
 freemail accounts, legitimately used in the past to correspond with your 
 customers/clients, that have been compromised and then used to send 
 spam.  How do you propose to still allow the legitimate account holders 
 to email your clients if you blacklist the sender?
 

I don't want to blacklist the address, which is why in my
original e-mail I said other than blacklisting.  I know blacklisting
would block these bogus e-mails as well as legit e-mails as soon as the
clients get access back (they currently don't have access to their
accounts because their passwords have been changed).  


 
 Martin's suggestion followup should point you in the right direction. 
 Sets of phrase rules (how similar are these messages?  do you have ten 
 or fifteen you can compare sentence-by-sentence?) with low scores will 
 likely help some too.  Meta rules that bump the score up depending on 
 how many phrases hit, or phrase+mismatched-sender/reply also work 
 tolerably well on this class of spam... if you can get enough samples to 
 build a complete enough set of phrase rules.

I'm going to look at what Martin suggested and compare it to what
samples I have.

Thanks,

--Dennis




Re: Bogus mails from hijacked accounts

2010-03-11 Thread Dennis B. Hopp

 It's not conditional, just using a meta rule and negating the Reply-to
 test in the meta:
 
 describe FORGED_HOTMAIL   Hotmail with non-Hotmail Reply-to address
 header   __FORGED_HM1     From =~ /\@hotmail\.com/i
 header   __FORGED_HM2     Reply-to =~ /\@hotmail\.com/i
 meta     FORGED_HOTMAIL   (__FORGED_HM1 && !__FORGED_HM2)
 score    FORGED_HOTMAIL   5.0
 
 and write cookie cutter rules for Yahoo and Gmail. 
 
 OTOH if you're happy that a Japanese test won't generate FPs you can
 cover all three ISPs with one rule:  
 
 describe FORGED_FROM Hotmail, Yahoo or Google with Japanese Reply-to
 header   __FF1       From =~ /\@(hotmail|yahoo|gmail)\.com/i
 header   __FF2       Reply-to =~ /\.jp/i
 meta     FORGED_FROM (__FF1 && __FF2)
 score    FORGED_FROM 5.0

Thanks Martin.  This is actually far simpler than I was thinking it
would be.

--Dennis



Re: Bogus mails from hijacked accounts

2010-03-11 Thread Dennis B. Hopp

 I don't think the accounts were hijacked: the headers showed that the
 messages the OP posted were not sent from the domain hosting the mail
 accounts. It looked to me as if somebody has sold on lists of valid
 hotmail etc. accounts.
 
 I smell an inside job, or at least some careful preparation, because the
 OP reckons that these accounts (forged as sender) were paired with valid
 accounts he hosts that would be used by the owner of the forged account.
 The messages I saw took the form:

We got one owner of the hijacked accounts to admit he got an e-mail that
basically said "Hi, we are trying to get rid of dead accounts so please
click here to verify your information."  The site then very nicely asked
for his username/password, which he gave, and then voilà, no more access
to his account.  The message was then sent to every address in his
address book (which is why many of my users got the same message).

Sadly, we have had this happen a couple of times with hotmail and yahoo 
addresses.

What can I say, some of our clients aren't exactly the most tech savvy.

--Dennis



Re: Bogus mails from hijacked accounts

2010-03-11 Thread Dennis B. Hopp

 ...and I suppose the same would apply to social networks. I don't use
 either, so am somewhat clueless about what goodies are available if you
 can access their accounts.
 

I have some free e-mail accounts that I use as throwaway accounts, for
when a site just HAS to have a valid e-mail so you can read the news
article or whatever.  I might log in to the accounts about once a month.

   The one of these I encountered at $DAYJOB was sent to the account
  owner's wife's ex-husband-- not my first choice when asking for emergency
  funds. The email also claimed he was traveling in London-- the guy AFAIK
  hasn't left Texas, let alone the US, in the past few years-- and used a
  number of phrases that a native speaker of American so-called-English
  wouldn't.
 
 OK, looks like I hugely overestimated the intelligence of recipients of
 such scams and hence the care needed to target an attack.
 

It's a sad thing, but a lot of people fall for stupid scams every day...



Bogus mails from hijacked accounts

2010-03-10 Thread Dennis B. Hopp
We seem to be having a problem where clients that we interact with
regularly are having their hotmail/gmail/yahoo accounts hijacked.  We
are receiving e-mails from their accounts that legitimately go through
the correct servers (hotmail, yahoo, etc.) and so they get passed through
our spam filters.  The messages have different bodies but basically say
the same thing: that they were on vacation and had all their money stolen,
so they need to have money wire transferred to them.

Obviously we just have to tell the clients that they need to deal with
the various e-mail providers, but is there an effective way that I can
filter these messages out before my users see them without blacklisting
the address?  In one case I had probably 15 users that received the same
message and naturally they freaked out.

I have put a sample at:

http://pastebin.com/9BDXrxmm

Note I did change the real e-mail address in this message but the
hotmail address used is valid just masked.

The message doesn't hit any rules of significance on my system.

BAYES_00=-1.9,FREEMAIL_FROM=0.001,HTML_MESSAGE=0.001,RCVD_IN_DNSWL_NONE=-0.0001,SPF_PASS=-0.001,T_RP_MATCHES_RCVD=-0.01,T_TO_NO_BRKTS_FREEMAIL=0.01
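
A rough way to see how much of that result is Bayes alone is to parse the hit list and total it with and without BAYES_00. This is only a sketch in Python; the helper name parse_scores is made up for illustration and is not part of SpamAssassin.

```python
def parse_scores(line):
    """Parse 'RULE=score,RULE=score,...' into a dict of floats."""
    scores = {}
    for item in line.strip().split(","):
        name, _, value = item.partition("=")
        scores[name.strip()] = float(value)
    return scores


# Hit list copied from the message above.
hits = ("BAYES_00=-1.9,FREEMAIL_FROM=0.001,HTML_MESSAGE=0.001,"
        "RCVD_IN_DNSWL_NONE=-0.0001,SPF_PASS=-0.001,"
        "T_RP_MATCHES_RCVD=-0.01,T_TO_NO_BRKTS_FREEMAIL=0.01")

scores = parse_scores(hits)
total = sum(scores.values())
without_bayes = total - scores["BAYES_00"]
print(f"total={total:.4f}  without Bayes={without_bayes:.4f}")
```

With Bayes removed the total is essentially zero, so no other rule has any real opinion about this message.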


Thanks

--Dennis



Re: Bogus mails from hijacked accounts

2010-03-10 Thread Dennis B. Hopp

On Wed, 2010-03-10 at 20:22 +, Martin Gregorie wrote:
 On Wed, 2010-03-10 at 13:37 -0600, Dennis B. Hopp wrote:
 
  Obviously we just have to tell the clients that they need to deal with
  the various e-mail providers, but is there an effective way that I can
  filter these messages out before my users see them without blacklisting
  the address?
 
 There's nothing in SA that can blacklist a sending MTA, so blacklisting
 can't happen unless you've added something to your MTA set-up that does
 auto-blacklisting.
 

I meant blacklisting the sender address, not the MTA.

 The question then comes down to marking the message as spam and dealing
 with it however you normally deal with spam. You'll probably need custom
 rule(s) to handle that. You say the message bodies are quite variable,
 but I notice that the Reply-to: header doesn't remotely match the From:
 header. Is this a common factor?
 

In the ones that I have seen, the Reply-To doesn't match the From, and I
think the Reply-To addresses have all been something.jp

 If it is, and the body texts have no common features that could also be
 used, the only obvious approach would be a rule for each forged sending
 domain that fires if the sending domain doesn't match the Reply-to
 domain. 
 

There isn't anything in common that I can see that wouldn't be
susceptible to false positives.  One even left the client's signature
intact.  I've written fairly simple custom rules before but I'm not sure
how to do conditional rules.  I'll have to dig into the docs a little
more.
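
The check being discussed (From in a watched freemail domain, Reply-To pointing somewhere else) can be sketched outside SA to try it against saved samples first. This is Python using only the standard library, not an SA rule; the function names and the example.* addresses are invented for illustration.

```python
from email import message_from_string
from email.utils import parseaddr


def addr_domain(header_value):
    """Return the lower-cased domain of an address header, or ''."""
    _, addr = parseaddr(header_value or "")
    return addr.rpartition("@")[2].lower() if "@" in addr else ""


def reply_to_mismatch(raw_message, watched_domains):
    """True if From is in a watched domain and Reply-To points elsewhere."""
    msg = message_from_string(raw_message)
    from_dom = addr_domain(msg.get("From"))
    reply_dom = addr_domain(msg.get("Reply-To"))
    return bool(reply_dom) and from_dom in watched_domains and reply_dom != from_dom


# Hypothetical sample; the addresses are placeholders, not from the real mail.
sample = (
    "From: Some Client <client@hotmail.example>\n"
    "Reply-To: client@mail.example.jp\n"
    "Subject: urgent\n"
    "\n"
    "body\n"
)
print(reply_to_mismatch(sample, {"hotmail.example"}))
```

Legitimate mail also sets a different Reply-To at times (mailing lists, shared mailboxes), so a check like this probably deserves a modest score rather than an outright block.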
 
 Only you can know if these rules would cause false positives: I can't
 possibly tell from a single sample message.
 

I wasn't expecting anybody to give me a magic rule that would fix it,
just suggestions since I would only be able to blacklist the sender
address after the e-mail had been received and I was notified of the
problem.  And obviously blacklisting all of gmail/hotmail/yahoo isn't an
option.

Thanks,

--Dennis



Re: Bogus Dollar Amounts

2010-02-25 Thread Dennis B. Hopp

Quoting Kai Schaetzl mailli...@conactive.com:


Dennis B. Hopp wrote on Wed, 24 Feb 2010 09:14:58 -0600:

Obviously I have something going on with my bayes, but that's a   
separate issue


Indeed. But it's an important issue. If it is that biased for other
spam as well, you are better off not using it in this state.

X-Spam-Status: No, score=2.8 required=5.0 tests=BAYES_50,HK_MUCHMONEY,
T_LOTS_OF_MONEY,UNPARSEABLE_RELAY autolearn=no version=3.3.0

add your RBL score and it's way over 5.



I agree it's an important issue.  I had turned off bayes autoexpire in
local.cf and at some point had taken out the cron job that did a manual
force-expire.  Once I did a force-expire, BAYES_60 triggered rather
than BAYES_00.


What is the HK_MUCHMONEY rule that you have?  Is that part of the base  
SA installation?


Thanks,

--Dennis


Bogus Dollar Amounts

2010-02-24 Thread Dennis B. Hopp
I have been seeing a few spam mails slip past that talk about being  
able to get bogus dollar amounts.  What I mean by that is it will give  
a large value in the e-mail but where there should be a comma it puts  
a period.
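
As an illustration of the pattern described (a period where a US-style amount would have a comma), here is a small Python regex sketch. The pattern and function name are assumptions made for this example and do not correspond to any existing SA rule.

```python
import re

# A dollar amount whose digit groups are separated by periods, e.g.
# "$2.500.000" or "$750.000". At least one ".ddd" group is required, and
# the match must not be followed by another digit.
PERIOD_GROUPED = re.compile(r"\$\s?\d{1,3}(?:\.\d{3})+(?!\d)")


def find_period_grouped_amounts(text):
    """Return every dollar amount that uses '.' as a digit group separator."""
    return PERIOD_GROUPED.findall(text)


print(find_period_grouped_amounts("Claim $2.500.000 today, fee only $19.99"))
```

An ordinary price such as $19.99 is not matched because its fractional part has two digits. Writers in locales that group digits with periods would match, so on its own this is a weak signal.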


I put an example of one of these messages at:

http://pastebin.com/SXuGELUS

Are there any rules that can detect this?  The only rules this hit on  
mine are:


1.900   DCC_CHECK
1.449   RCVD_IN_BRBL_LASTEXT
1.000   RCVD_IN_BRBL
-0.001  SPF_PASS
-0.010  T_RP_MATCHES_RCVD
-1.900  BAYES_00

Obviously I have something going on with my bayes, but that's a separate issue.

Thanks,

--Dennis


Re: Bogus Dollar Amounts

2010-02-24 Thread Dennis B. Hopp

Never mind... it was also hitting

T_LOTS_OF_MONEY

and once I expired old bayes tokens it no longer hit BAYES_00.  Now I
just have to figure out what's up with my bayes db.


--Dennis

Quoting Dennis B. Hopp dh...@coreps.com:


I have been seeing a few spam mails slip past that talk about being
able to get bogus dollar amounts.  What I mean by that is it will give
a large value in the e-mail but where there should be a comma it puts a
period.

I put an example of one of these messages at:

http://pastebin.com/SXuGELUS

Are there any rules that can detect this?  The only rules this hit on
mine are:

1.900   DCC_CHECK
1.449   RCVD_IN_BRBL_LASTEXT
1.000   RCVD_IN_BRBL
-0.001  SPF_PASS
-0.010  T_RP_MATCHES_RCVD
-1.900  BAYES_00

Obviously I have something going on with my bayes, but that's a   
separate issue


Thanks,

--Dennis





Re: Bogus Dollar Amounts

2010-02-24 Thread Dennis B. Hopp



It is common in many parts of the world to use a period instead of a
comma as a digit group separator, and vice-versa for the decimal
separator.

http://en.wikipedia.org/wiki/Thousands_separator#Digit_grouping



I knew it was common in other parts of the world, but for some reason  
was thinking that when referring to US Dollars it wouldn't be.  Now  
that I think about it I can understand why my original thought was  
wrong.


I guess it doesn't really matter since the message was actually  
hitting another rule (T_LOTS_OF_MONEY) that I somehow missed.


--Dennis



Re: SA: lottery message scored hammy by bayes

2009-08-27 Thread Dennis German

I am not sure whether bayes is autolearning.
I am on a shared host service (midphase)
which uses cPanel and has exim do the spamassassin stuff.
They use my scores but ignore other commands.
When I get a message I think I shouldn't have, I
save it and run spamc  m   .out in order to
see the X-Spam-Report (which is not included in ham!).

My userprefs is always available at
http://www.Real-World-Systems.com/mail/user_prefs.html


I have not manually trained bayes.
Thanks



John Hardin wrote:

On Tue, 25 Aug 2009, Dennis German wrote:


email with this content:

CONGRATULATION ...

received these scores

X-Spam-testscores: 
BAYES_00=-2.599,HTML_MESSAGE=0.001,MISSING_HEADERS=5.7,

   SUBJ_ALL_CAPS=3.1,UPPERCASE_75_100=1.528

Does this indicate that bayes needs tuning/learning?


Can you paste the output from sa-learn --dump magic ?

It probably indicates that Bayes has been mistrained - somebody is 
training spammy messages as ham.


How do you do your Bayes training? Autolearning, or purely manual, or 
some combination?


How many messages are getting inappropriate Bayes scores? If a lot are, 
you'll probably want to turn off autolearning (if you're using it) until 
you analyze the problem. You may need to wipe your Bayes database and 
start fresh if the problem is bad enough.


If you're using autolearning, what are your learning thresholds?

If you're manually training, do you keep your corpora so that you can
review and correct errors? If so, review your ham corpora and see if any
spams have crept in - and if so, retrain them as spam; SA will forget
that they were hammy.




lottery message scored hammy by bayes

2009-08-25 Thread Dennis German

email with this content:

CONGRATULATION YOUR EMAIL ADDRESS HAS WON YOU THE 2010 FIFA WORLDCUP LOTTERY
OPEN THE ATTACHMENT AND VIEW THE PROFILE OF YOUR WINNING FUND, ALSO CONTACT
YOUR CLAIM AGENT

received these scores

X-Spam-testscores: BAYES_00=-2.599,HTML_MESSAGE=0.001,MISSING_HEADERS=5.7,
   SUBJ_ALL_CAPS=3.1,UPPERCASE_75_100=1.528

Does this indicate that bayes needs tuning/learning?

Thank you



sa: lottery message scored hammy by bayes:salearn --dump magin

2009-08-25 Thread Dennis German

sa-learn --dump magic
config: could not find site rules directory
0.000          0          3          0  non-token data: bayes db version
0.000          0     262297          0  non-token data: nspam
0.000          0      24621          0  non-token data: nham
0.000          0     142776          0  non-token data: ntokens
0.000          0 1246871454          0  non-token data: oldest atime
0.000          0 1251249448          0  non-token data: newest atime
0.000          0 1251218718          0  non-token data: last journal sync atime
0.000          0 1249634620          0  non-token data: last expiry atime
0.000          0    2764800          0  non-token data: last expire atime delta
0.000          0      65002          0  non-token data: last expire reduction count
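
For anyone wanting to pull numbers out of a dump like this, a small Python sketch follows. It assumes the value of interest is always the third whitespace-separated column before the "non-token data:" label, as in the output above; parse_magic is a made-up helper name, not an SA tool.

```python
# Selected lines from the dump above.
DUMP = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0     262297          0  non-token data: nspam
0.000          0      24621          0  non-token data: nham
0.000          0     142776          0  non-token data: ntokens
0.000          0    2764800          0  non-token data: last expire atime delta
"""


def parse_magic(text):
    """Map each 'non-token data' label to the integer in the third column."""
    values = {}
    for line in text.splitlines():
        if "non-token data:" not in line:
            continue
        fields, _, label = line.partition("non-token data:")
        values[label.strip()] = int(fields.split()[2])
    return values


magic = parse_magic(DUMP)
ratio = magic["nspam"] / magic["nham"]
delta_days = magic["last expire atime delta"] / 86400  # seconds per day
print(f"spam:ham learned = {ratio:.1f}:1, expire delta = {delta_days:.0f} days")
```

Here the database has learned more than ten spam for every ham, and the last expiry used a 32-day window.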


Re: mail slipping through

2009-08-19 Thread Dennis B. Hopp

Quoting Gary Smith gary.sm...@holdstead.com:

I've been having a pretty good hit rate on spam until recently   
(about two weeks).  Two types of email have been coming through at a  
 good rate.  I'm receiving at least four per hour from the domains   
included below.  I've also been training bayes with them as well, to  
 no avail.


Is it pretty much the same body, just different senders?



*...@chocolatebearbear .INFO
*...@biblegame .info
*...@clickbetterthere .info



If it's just the senders you could easily blacklist the domains; none
of these domains look all that legit.


Can you copy a message or two (with full headers) to pastebin so we  
can have a look?


--Dennis


Re: blacklisting a forger; summary; /* end

2009-08-03 Thread Dennis G German
Summary:

 

Problem:

Observing scatter from many different sites coming to vari...@mydomain.com.

These are NDRs (Non-Delivery Reports) to messages sent from
the forger or infected system:

59.184.51.13 aka triband-mum-59.184.51.13.mtnl.net.in

It is already blacklisted on many Realtime Black Lists, as seen via
http://www.mxtoolbox.com/blacklists.aspx

The various sites that are sending NDRs should be checking one of
the RBLs and ignoring the initial email.

My email is configured to accept all vari...@mydomain.com so it
does not contribute to network traffic by sending NDRs.

First forwarder: relay1.sea.eschelon.com (66.213.193.108)  should

Thanks to all for comments and suggestions

 



Backscatter.org used as RBL??

2009-08-03 Thread Dennis G German
Is Backscatter.org http://www.backscatterer.org/index.php  used by any
rules?

 

I looked but did not find any.

Dennis G German



blacklisting a forger

2009-08-01 Thread Dennis German

I have received  many emails in the last hour which were undeliverable,
NOT sent by me.
It seems someone is forging usernames in my domain Real-World-Systems.com
as the from: and the return-path: .

Received-From-MTA: dns;triband-mum-59.184.51.13.mtnl.net.in


I have sent a message to ab...@mntl.net.in and helpd...@mtnl.net.in but 
no response.


How does an MTA get blacklisted??




Re: Cant Post Message

2009-07-31 Thread Dennis B. Hopp

Quoting twofers twof...@yahoo.com:

I have a post I have tried several times over the last week to post   
to this forum and it never seems to get posted. I don't understand   
why?

 
There is nothing exotic about it, just text, a question and email   
header info I pasted.

 
Any idea whats up?
 
Thanks,
 
Wes





Try putting the header on a site like www.pastebin.com and then put
the link in your e-mail rather than the actual header.


--Dennis


Re: Number of rules

2009-07-31 Thread Dennis B. Hopp

Quoting LuKreme krem...@kreme.com:


On Jul 30, 2009, at 18:12, Dennis B. Hopp dh...@coreps.com wrote:
Yeah I knew that.  I have a few negative scoring rules but not many  
 (outside of what might be in the misc rules sets I have).  What is  
 a good threshold for ham then?


5.0 is the score SA is designed for. It's a very good number in almost
all cases.


I meant the threshold for bayes auto learn to learn the message.  I'll  
try switching back to the default values.


Re: Number of rules

2009-07-31 Thread Dennis B. Hopp

Quoting RW rwmailli...@googlemail.com:


On Fri, 31 Jul 2009 03:55:48 +0200
Karsten Bräckelmann guent...@rudersport.de wrote:



The default of 0.1. It's a default for a reason.

But that *really* is not your problem. Your problem is with learning
spam, not learning even more ham. Just as you mentioned in your
original report. See my previous response for a solution. You want to
learn more spam.


What he actually wrote was that 3.7% of _all_messages_ were
hitting BAYES_00, and 1.7% were hitting BAYES_99.

If he actually meant what he wrote and doesn't have an extraordinary
spam/ham ratio, then he clearly has a problem with both spam and
ham.



I cleared my maia statistics a couple of days ago.  Since then
BAYES_00 has triggered 4510 times, BAYES_99 2366 times and BAYES_50
1568 (all the other BAYES_XX are fewer than 1000 times).  In those same
couple of days we have processed about 45,000 messages (this is the
number of messages that actually reached SpamAssassin and weren't
outright rejected).  So my initial percentages were way off (I was going
by Maia Mailguard's SA rule statistics).  So roughly 10% of mail is
hitting BAYES_00 and 5% is hitting BAYES_99.  It seems to me that
BAYES_99 should probably be triggered more often than BAYES_00.
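
The percentages quoted follow directly from the hit counts. A quick Python check, taking the "about 45,000" figure at face value as the denominator:

```python
# Hit counts and message total as quoted in the post.
hits = {"BAYES_00": 4510, "BAYES_99": 2366, "BAYES_50": 1568}
total_messages = 45000  # "about 45,000" in the post


def hit_rates(hits, total):
    """Return each rule's hits as a percentage of all messages."""
    return {rule: 100.0 * count / total for rule, count in hits.items()}


rates = hit_rates(hits, total_messages)
for rule, pct in rates.items():
    print(f"{rule}: {pct:.1f}%")
```

The result is only as good as the denominator: if some of those 45,000 messages never reached SA, the true rates are higher.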


If there is a better way to get sa statistics I'd be happy to know.

I know that the bayes success rate comes down to training, but like  
every other administrator I can't possibly check every message for  
accuracy and I was hoping to make the auto learn a little better.  I  
thought maybe I just didn't have enough rules (both negative and  
positive scoring) to trigger the auto learn often enough.


Thanks,

--Dennis



Re: Number of rules

2009-07-31 Thread Dennis B. Hopp

Quoting John Hardin jhar...@impsec.org:


On Fri, 31 Jul 2009, Dennis B. Hopp wrote:

I cleared my maia statistics a couple of days ago.  Since then   
BAYES_00 has triggered 4510 times, BAYES_99 2366 times and BAYES_50  
 1568 (all the other BAYES_XX are fewer than 1000 times).


Do they all add up to about 45,000?



No they don't.  I see some messages that trigger no rules at all  
(Bayes or otherwise).  I thought that was odd since I thought a bayes  
rule should trigger pretty much all the time.


In those same couple of days we have processed about 45,000
messages (this is the number of messages that actually reached
SpamAssassin and weren't outright rejected).


If there is a better way to get sa statistics I'd be happy to know.


sa_stats.pl from the SARE website.

http://www.rulesemporium.com/programs/


I'll take a look.  Will this work with logs that are written by amavisd-new?

Thanks,

--Dennis



Re: Number of rules

2009-07-31 Thread Dennis B. Hopp

Quoting Karsten Bräckelmann guent...@rudersport.de:


On Fri, 2009-07-31 at 06:07 -0700, John Hardin wrote:

On Fri, 31 Jul 2009, Dennis B. Hopp wrote:

 I cleared my maia statistics a couple of days ago.  Since then BAYES_00 has
 triggered 4510 times, BAYES_99 2366 times and BAYES_50 1568 (all the other
 BAYES_XX are fewer than 1000 times).

Do they all add up to about 45,000?


Doh!  Good catch, John.

No, they cannot possibly.  Do the math. These 3 rules are less than 10k,
remaining 35k. Each less than 1k hits means we need another  35 rules.
However, there are merely 6 ones left.

  $ grep -c BAYES_ 50_scores.cf
  9

The stats are incorrect.  Well, unless the lion's share is processed with
Bayes disabled, or otherwise not processed by SA.
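
The arithmetic behind that conclusion, spelled out as a Python sketch. It uses only the figures quoted in this thread; the 1000-hit cap on the remaining rules is the upper bound stated in the original post.

```python
# Figures quoted in the thread.
reported = {"BAYES_00": 4510, "BAYES_99": 2366, "BAYES_50": 1568}
total_messages = 45000
bayes_rules = 9             # from: grep -c BAYES_ 50_scores.cf
cap_per_other_rule = 1000   # "all the other BAYES_XX are less than 1000"

accounted = sum(reported.values())
unaccounted = total_messages - accounted
other_rules = bayes_rules - len(reported)
max_from_others = other_rules * cap_per_other_rule
shortfall = unaccounted - max_from_others

print(f"accounted={accounted} unaccounted={unaccounted}")
print(f"{other_rules} other rules can explain at most {max_from_others}")
print(f"at least {shortfall} messages had no BAYES_* hit at all")
```

So even in the most generous case, roughly two thirds of the counted messages never got a Bayes result.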


I do have sanesecurity rules in clamav which may be filtering messages
before spamassassin sees them which would account for some of the
difference between the total BAYES triggered and messages received.
We also relay all outbound mail through these same servers but do not
send outbound mail through spamassassin which again would make for  
some difference.  I should have thought to mention that before.


I couldn't get sa-stats to give me any useful information.  I did get
amavis-logwatch and I am not sure if I like what it's showing me.  I ran it
against the last few maillogs I have so it encompasses basically the  
last month.  Here is the relevant parts of the output:


http://pastebin.com/m59ddaf1d

If I'm reading that correctly less than 50% of mail is actually
being filtered (seems like it should be higher than that). Those stats  
don't count the messages we completely reject.  We don't reject solely  
on one RBL but use policy-weightd to reject messages.  I guess I could  
just let all messages through to SA for a few days to see how things  
change, but I don't see the point of wasting CPU/Memory for messages  
that are pretty much guaranteed spam.


Here are the stats on my postfix:

http://pastebin.com/m15d2533e

Maybe I'm worried about nothing but given some of the spam that I get  
forwarded that gets through (some very obvious spam) and then to see  
what rules it hits just makes me think that something isn't quite right.


--Dennis




Re: Number of rules

2009-07-31 Thread Dennis B. Hopp

Quoting Karsten Bräckelmann guent...@rudersport.de:


If I'm reading that correctly less than 50% of mail is actually
being filtered (seems like it should be higher than that). Those stats


Actually, the numbers you gave for the last couple days are even
lower. About one third, 15k out of 45k do have a BAYES_xx hit and thus
are scanned by SA.

I told you how to train your Bayes, if you're not satisfied with the
result. Whether you like it or not, there really isn't another way. FWIW,
blocking the obvious offenders early seems like a proper explanation for
Bayes not showing a lot of high hitters.


Yes you did and I'm going to set something up to make a copy of the  
messages that trigger BAYES_20 through BAYES_80 into a separate  
mailbox that I can then inspect periodically for a while (while still  
letting the message be delivered to the user)




Anyway, considering the back and forth -- IMHO, you *first* should get a
clear picture how exactly your mail is being processed. I don't feel
like stabbing in the dark.



And I don't expect you to take a stab in the dark.  The 45K messages
was the total processed inbound and outbound; I didn't think about the
fact that outbound is not funneled through SA and so would not be
seen in BAYES.  So I admit, it was a poor analysis on my part.





Maybe I'm worried about nothing but given some of the spam that I get
forwarded that gets through (some very obvious spam) and then to see
what rules it hits just makes me think that something isn't quite right.


Forwarded -- as in reports by your users, or forwarded from external MXs
to yours? In the latter case, the obvious thing to check is your
internal and trusted network settings.



Forwarded from internal users asking how it got through the spam  
filters.  I rarely get reports to our abuse/postmaster addresses (with  
the exception of AOL users who mark messages as spam when they clearly  
are not spam).


Number of rules

2009-07-30 Thread Dennis B. Hopp
I'm using maia-mailguard with spamassassin 3.2.5.  For the most part  
it seems to be working ok but I feel like too many messages are  
hitting BAYES_00 (roughly 3.7% of all messages) and BAYES_99 is only  
hitting about 1.7%.  I have bayes autolearn on with ham being learned  
at -1.0 and spam learned at 8.0


I'm sort of thinking part of my problem is I just don't have enough  
rules so I'm curious how many rules do other users out there have in  
their spamassassin setup?


I currently have about 2558 rules consisting of stock rules, SOUGHT,  
KHOP, SARE, some custom rules I wrote and various rules I've seen  
posted on this list and other sites.  I have a few plugins enabled as  
well (FreeMail, iXhash, Botnet, ASN, Pyzor, Razor2, DCC)


I know some of it is just training of the bayes but I'm wondering if  
just lack of rules might be causing some of my problems.


Thanks,

--Dennis



Re: Number of rules

2009-07-30 Thread Dennis B. Hopp

Quoting RW rwmailli...@googlemail.com:



Bear in mind that autolearning uses its own version of the score that
excludes whitelisting and Bayes, which means that very little ham will
reach the -1 threshold unless you've added your own site-specific rules
for identifying it.



Yeah I knew that.  I have a few negative scoring rules but not many  
(outside of what might be in the misc rules sets I have).  What is a  
good threshold for ham then?


--Dennis


Re: Email from myself to myself

2009-05-28 Thread Dennis German

Do you see any X-Spam headers in the emails?

Is this on a shared server  (cPanel)?

hateSpam wrote:

I have spamassassin installed in my server but I have never had an email with
[SPAM] in the subject. I get lots of spam. I think it is not checking
properly. 


anybody know how to solve the problem please?
  


Re: AWL - lets change the name to HEAT with ln

2009-05-28 Thread Dennis German

How 'bout a link from  HEAT   (  Heuristic Email Address Tracking )



Matus UHLAR - fantomas wrote:
On Wednesday 27 May 2009 LuKreme wrote:
No, you are confused. This is common, lots of people are confused
about this. This is why many people think the name needs to be
changed to Averaged Weight List or something similar.



On 28.05.09 12:06, Michael Monnerie wrote:
  
The name is really a mess. Even if you'd call it Averaged Weight List, 
when you read AWL then AutoWhiteList comes to your mind, right?



no.
It comes to your mind only because it was (and yet is) named that way. If
there would be no auto white list, we just would not know what AWL means.

  

It needs another TLA (three letter acronym), or maybe more, whatever. Even
AHBSASWAWB, as written by Matt Kettler, would be better than AWL.

I don't really mind, just change it. Someone. But also the AWL shortcut.



That would break any existing configuration. There's a bug report open for
this (I commented today) and admins just SHOULD read the docs.

  




Re: SA: what do SPF_SOFTFAIL SPF_NEUTRAL mean++ThankYou

2009-05-18 Thread Dennis German

Sahil Tandon wrote:

On Sun, 17 May 2009, Dennis German wrote:

  

Could someone discuss or add a wiki page about?

SPF_SOFTFAIL



http://www.openspf.org/RFC_4408#op-result-softfail

  

SPF_NEUTRAL



http://www.openspf.org/RFC_4408#op-result-neutral

  




SA: what do SPF_SOFTFAIL SPF_NEUTRAL mean

2009-05-17 Thread Dennis German

Could someone discuss or add a wiki page about?

SPF_SOFTFAIL

SPF_NEUTRAL




Re: spamassassin block *.png

2009-05-01 Thread Dennis Davis
On Fri, 1 May 2009, vibi wrote:

 From: vibi ml...@go2.pl
 To: users@spamassassin.apache.org
 Date: Fri, 1 May 2009 02:56:34 -0700 (PDT)
 Subject: spamassassin block *.png

 How to use spamassassin block *.png so that going to the quarantine?
 100% of spam that gets to me a plain e-mail with attachment *.png

One possible tool to help reduce this is the FuzzyOcr plugin:

http://fuzzyocr.own-hero.net/

You'll need other graphics software used by the above plugin.

For example, a message I received a couple of days ago scored:

X-Spam-Report: 1.0/6.0
 Start SpamAssassin results 
*  1.0 DC_IMG_TEXT_RATIO BODY: Low body to pixel area ratio
 End SpamAssassin results

With the addition of the FuzzyOcr plugin it scored:

X-Spam-Status: Yes, score=12.1 required=6.0 tests=FUZZY_OCR,RDNS_NONE
autolearn=disabled version=3.2.5
X-Spam-Report: 
*  0.1 RDNS_NONE Delivered to trusted network by a host with no rDNS
*   12 FUZZY_OCR BODY: Mail contains an image with common spam text inside
*  [Words found:]
[viagra in 5 lines]
[profit in 1 lines]
[(9 word occurrences found)]
-- 
Dennis Davis, BUCS, University of Bath, Bath, BA2 7AY, UK
d.h.da...@bath.ac.uk   Phone: +44 1225 386101


Re: Phishing

2009-04-27 Thread Dennis Davis
On Fri, 24 Apr 2009, SM wrote:

 From: SM s...@resistor.net
 To: users@spamassassin.apache.org
 Date: Fri, 24 Apr 2009 22:03:21 -0700
 Subject: Re: Phishing

...

 There was a project from an educational institution to target
 phishing emails.  I don't recall the name of the project or
 whether the source code was released.

You might be thinking of Kochi:

http://oss.lboro.ac.uk/kochi1.html

The Google project:

http://code.google.com/p/anti-phishing-email-reply/

is also useful as it attempts to detail the compromised accounts.
Just block/quarantine email for those accounts.

...of course the phishers are now sending out form URLs to
be completed:

http://jotform.com/form/91140758246
-- 
Dennis Davis, BUCS, University of Bath, Bath, BA2 7AY, UK
d.h.da...@bath.ac.uk   Phone: +44 1225 386101


SA: TVD_ rules. T? V? D? acronym?

2009-04-08 Thread Dennis G German
There is a group of rules that begin with TVD_, like

TVD_PH_SUBJ_ACCOUNTS_POST, TVD_QUAL_MEDS, TVD_RCVD_SINGLE

What does TVD stand for?



SA: user_prefs contains required 4.97,

2009-04-03 Thread Dennis German

I have had
required_score 3.97
 since 4/1/09 but spamassassin email says

X-Spam-Report:
   ...
  Content analysis details:   (18.4 points, 4.0 required)

also MISSING_DATE 3.0  should be 2.97 and
MISSING_MID 3.0  should be 2.97

 I had these values several days ago!

Any ideas??

ls -l /var/run/spamd
-rw-r--r-- 1  5 Apr  3 12:02 spamd.pid

Current user_prefs can always be seen at
http://www.real-world-systems.com/mail/user_prefs.html

Thanks

RE: sa-update: determining last run? Not in /var/lib/spamassassin

2009-03-30 Thread Dennis G German
   spamassassin --version
SpamAssassin version 3.2.4
  
  ls -l /var/lib/spamassassin
 drwxr-xr-x 3 4096 Oct 16 18:27 compiled/3.002004 ...


The ONLY directory under /var/lib/spamassassin 
is
compiled 

and it does not contain any .cf files,
nor do any of the subdirectories

PS
Sorry for the previous poorly worded post as I was 
thrown after finding we are using an old version!



sa-update: determining last run

2009-03-29 Thread Dennis G German
 sa-update

mkdir /etc/mail: Permission denied at /usr/bin/sa-update line 1226

 

There is no /etc/mail directory available. (I believe the /etc directory I
can view is artificial.)

I cannot make a mail directory. 

I suspect this is another cPanel (shared host) problem.

 

Is there a way I can determine when sa-update was last run?

Thanks

 

sa-update -D
[19204] dbg: logger: adding facilities: all
[19204] dbg: logger: logging level is DBG
[19204] dbg: generic: SpamAssassin version 3.2.4
[19204] dbg: config: score set 0 chosen.
[19204] dbg: dns: is Net::DNS::Resolver available? yes
[19204] dbg: dns: Net::DNS version: 0.65
[19204] dbg: generic: sa-update version svn607589
.
[19204] dbg: gpg: Searching for 'gpg'
[19204] dbg: util: current PATH is: /home/realger1/.bin:/usr/kerberos/bin:/usr/lib/courier-imap/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/bin:/usr/X11R6/bin:/usr/libexec
[19204] dbg: util: executable for gpg was found at /usr/bin/gpg
[19204] dbg: gpg: found /usr/bin/gpg
[19204] dbg: gpg: importing default keyring to /etc/mail/spamassassin/sa-update-keys
mkdir /etc/mail: Permission denied at /usr/bin/sa-update line 1226



spamassassin: Determining last sa-update

2009-03-29 Thread Dennis German

I believe this is another cPanel issue.
Attempting to run sa-update displays:
   mkdir /etc/mail: Permission denied at /usr/bin/sa-update line 1226

How can I determine the last time sa-update was run?



SA: Determining last sa-update

2009-03-29 Thread Dennis German

I believe this is another cPanel issue.
Attempting to run sa-update displays:
   mkdir /etc/mail: Permission denied at /usr/bin/sa-update line 1226

How can I determine the last time sa-update was run?



sa-update when was last run?

2009-03-29 Thread Dennis German

I believe this is another cPanel issue.
Attempting to run sa-update displays:
   mkdir /etc/mail: Permission denied at /usr/bin/sa-update line 1226

How can I determine the last time sa-update was run?



spam assassin: default scores for URIBL_.._SURBL seem low to me

2009-03-24 Thread Dennis German

It seems to me that the default scores of 1.2 to 1.9
for messages containing URIs which are blacklisted
in any of the various JP, AB, OB, PH, SC, ... lists
should be significantly higher, perhaps nearly the default
required score of 5.0.

Some information is at http://ruleqa.spamassassin.org,
including the fact that
86% of URIBL_JP_SURBL hits also hit URIBL_OB_SURBL
66% of URIBL_JP_SURBL hits also hit URIBL_WS_SURBL
56% of URIBL_JP_SURBL hits also hit URIBL_AB_SURBL
etc
Is there a discussion of the tests and scores and philosophy
including but not limited to these somewhere?
Thanks,
Dennis German


spamassasin: sa-learn --dump magic intrepretation

2009-03-16 Thread Dennis German

Is there a document regarding the interpretation of


 sa-learn --dump magic
config: could not find site rules directory

0.000          0          3          0  non-token data: bayes db version
0.000          0     261451          0  non-token data: nspam
0.000          0      18530          0  non-token data: nham
0.000          0     143599          0  non-token data: ntokens
0.000          0 1231533845          0  non-token data: oldest atime
0.000          0 1237223892          0  non-token data: newest atime
0.000          0 1237214668          0  non-token data: last journal sync atime
0.000          0 1237059740          0  non-token data: last expiry atime
0.000          0    5529600          0  non-token data: last expire atime delta
0.000          0       9311          0  non-token data: last expire reduction count




Re: spamassasin: sa-learn --dump magic interpretation good/bad/other?

2009-03-16 Thread Dennis German

0) Michael, thanks

1) what are the various  zero columns??
for example in  0.000  0  3  0  non-token data: bayes db version

2) Is this good?  not too good? bad? trouble?


On Mar 16, 2009, at 14:03, Michael Scheidell wrote:


Is there a document regarding the interpretation of


 sa-learn --dump magic
config: could not find site rules directory

0.000          0          3          0  non-token data: bayes db version
0.000          0     261451          0  non-token data: nspam
0.000          0      18530          0  non-token data: nham
0.000          0     143599          0  non-token data: ntokens
0.000          0 1231533845          0  non-token data: oldest atime
0.000          0 1237223892          0  non-token data: newest atime
0.000          0 1237214668          0  non-token data: last journal sync atime
0.000          0 1237059740          0  non-token data: last expiry atime
0.000          0    5529600          0  non-token data: last expire atime delta
0.000          0       9311          0  non-token data: last expire reduction count




The db version is 3

nspam: 261,451 messages have been learned as spam.
nham: 18,530 messages have been learned as ham.

ntokens: the database holds 143,599 tokens (remember, some tokens could
appear in both spam and ham)


The oldest token atime, via date -j -f %s 1231533845:
Fri Jan  9 15:44:05 EST 2009

The newest token atime, via date -j -f %s 1237223892:
Mon Mar 16 13:18:12 EDT 2009
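
The atime values are plain Unix epoch seconds, so they can be converted anywhere, not just with BSD date. A short Python sketch (output is in UTC, so the clock times differ from the EST/EDT values above by the zone offset):

```python
from datetime import datetime, timezone


def atime_to_utc(epoch_seconds):
    """Convert a Bayes atime (Unix epoch seconds) to an aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


oldest = atime_to_utc(1231533845)   # "oldest atime" from the dump
newest = atime_to_utc(1237223892)   # "newest atime" from the dump
span_days = (newest - oldest).days

print(oldest.isoformat())
print(newest.isoformat())
print(f"token ages span {span_days} whole days")
```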




spamassassin: hosting service/cpanel problems user_prefs partially ignored -updated-

2009-03-13 Thread Dennis German
Updated, Thought you all might be interested  ( see  updates)

My intention is to observe false negatives (i.e. spam seen as ham) and
increase the score of one or more of the tests in an effort to cause  
additional spam to be detected.

I am using a hosting service where the spamassassin configuration
is updatable through the cPanel system.
I can also modify ~/.spamassassin/user_prefs directly.
When I list /etc there is no mail directory
(however, I believe I am not looking at the true /etc).
...
When I modify ~/.spamassassin/user_prefs to include:

report_contact postmas...@real-world-systems.com
report_hostname Real-World-Systems.com
required_score 4
score URIBL_JP_SURBL 5 #was 1.5 
score URIBL_SBL 5  #was 1.5 
score URIBL_SC_SURBL 5 #was 1.5 
score URIBL_WS_SURBL 5 #was 1.5 


the subject of spam messages is correctly modified to indicate *SPAM*,
the X-Spam-Report is correctly inserted with the revised hostname and
contact,
scores for URIBL_* are increased to 5,
and the report includes the message preview  ((note 4.0 required)):

  Content analysis details:   (4.0 points, 4.0 required)
pts rule name  description
 --  
0.9 RCVD_IN_SORBS_DUL  RBL: SORBS: sent directly from
dyna
  ...
  X-Spam-Flag: YES

The report is preceded by:
X-Spam-Status: Yes, score=4.0
X-Spam-Score: 40
X-Spam-Bar: 

There is no X-Spam-Checker-Version header which the documentation at
http://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Conf.html
says cannot be removed.

THE PROBLEMS:

1)Messages that are not flagged as spam have
X-Spam-Status: No, score=-0.7
X-Spam-Score: -6
X-Spam-Bar: /
X-Spam-Flag: NO


Apparently these headers are added by a module in cPanel which uses the
spamassassin APIs to process the email.

2) adding
 add_header all _TESTS(,)_
  has no effect on ham or spam.


3) adding
add_header all  DGG DGG
add_header ham  DGG DGG
add_header spam DGG DGG
has no effect on either spam or ham


Attempting to add headers via cpanel produces only
add_header all
add_header ham
add_header spam

Is my syntax for 3) correct?



spamassassin: attempt to process a single message fails at PerMsgStatus.pm line 164.

2009-03-13 Thread Dennis German

Attempting to see how spamassassin would score a message
I tried
 spamassassin  lottery.msg

[32179] warn: config: could not find site rules directory
check: no loaded plugin implements 'check_main': cannot scan! at
	/usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/PerMsgStatus.pm line 164.


message can be found at

http://real-world-systems.com/mail/lottery.msg



spamassassin: auto-whitelist : display/modify ?

2009-03-10 Thread Dennis German

Is there a utility to display auto-whitelist ?
Modify entries? remove entries?


Re: please help, getting hammered with snowshoe spam

2009-02-02 Thread Dennis Hardy

Yes, it has been a problem as there are so many domains used.  However, I
took everyone's earlier suggestions, including training Bayes against FN
snowshoe spam and adding the Barracuda RBL (BRBL), and this appears to
almost completely take care of the problem!!  So far I have been able to
remove all of my custom rules except for BRBL of course, and only a few of
these snowshoe spams get through now.  Nice!

Do people generally have good non-FP experience with BRBL?  I am thinking of
bumping up the score, but I get so much spam per day it is hard to check for
FPs with it enabled.  It seems like a great resource, will it be pushed out
with sa-update soon?  I believe it is enabled in svn, from what I've read.

Also I am using policyd-weight to do front-end greylisting if the DNSBL
checks trigger as this reduces load on the server.  Can anyone suggest how
to enable the BRBL in policyd-weight?  I'm not sure what values to use.

Again thank you for your help with this problem!  It is great to see SA
working so well now against it :-)


-- 
View this message in context: 
http://www.nabble.com/please-help%2C-getting-hammered-with-snowshoe-spam-tp21627042p21792616.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.



please help, getting hammered with snowshoe spam

2009-01-23 Thread Dennis Hardy

Hi, I'm getting hammered by snowshoe spam :-(  I've added rules to try to
catch common formats of included URLs in the spam, but I'm wary of scoring
these rules too high because of the potential for false positives.  It's
hard to come up with other rules as the spam e-mail content is so generic. 
By default these spams score incredibly low (bayes, etc.)  In many cases,
the low bayes values are scoring negative, which completely offsets the few
positive scoring rules that I have added.

Are there other RBLs or domain checks or something that could be used to
possibly get more indication that a spam is a snowshoe spam from a bogus
domain?  I've also added a meta rule that combines URIBL_BLACK, DCC_CHECK,
and my rules...but spam still gets by many times because it scores so
low/negative otherwise.  Maybe I just need to score everything higher...?

Any thoughts/advice are appreciated :-)


-- 
View this message in context: 
http://www.nabble.com/please-help%2C-getting-hammered-with-snowshoe-spam-tp21627042p21627042.html



Re: please help, getting hammered with snowshoe spam

2009-01-23 Thread Dennis Hardy

 why are those scores low? What gives them negative score?
 those rules have quite high score...

Here is an example (without my rules):  http://pastebin.com/m4400a74d

The ones that get through are relatively short and simple, and many are very
clean.  This example is just one that focuses on weight loss, some are
regarding tea or satellite companies or coffee makers or the like.  I worry
about increasing FPs of real e-mails by training of clean spams as spam,
when they are short and sweet and many times look like they could be
legitimate e-mails.

Also would training bayes on this sort of e-mail help if many things are
different between each e-mail, and if the e-mail is so short and relatively
clean?  Addresses change, company names change, sender domains are always
different, etc.

I've been thinking about maybe writing an SA plugin that counts the three
repeated URL patterns that are always present in all of these spams, but I
don't know where to start in trying to do that.  I was hoping I could just
handle this with SA rules or something (like using another RBL or
something).

Thank you!

-- 
View this message in context: 
http://www.nabble.com/please-help%2C-getting-hammered-with-snowshoe-spam-tp21627042p21627664.html



Re: please help, getting hammered with snowshoe spam

2009-01-23 Thread Dennis Hardy

 Is this spam for snowshoes or some spam term?

Like a snowshoe spreads the load of a traveler across a wide area of snow,
some spammers use many frequently-changing IP addresses and domains to
spread out the spam load in order to dilute recipient reputation metrics and
evade filters.

see http://www.spamhaus.org/faq/answers.lasso?section=Glossary#233

 If the former, put some example up on a pastebin (not here!).

Yes already done:  http://pastebin.com/m4400a74d


-- 
View this message in context: 
http://www.nabble.com/please-help%2C-getting-hammered-with-snowshoe-spam-tp21627042p21627984.html



Re: please help, getting hammered with snowshoe spam

2009-01-23 Thread Dennis Hardy

 I've been using this rule to knock some of these down:
   [...]
 Highly unusual to have a url like that in ham...
 I'm running a meta to bump up the score...

Yes, I've actually been doing the very same thing (URI detection and metas,
and then string matching in the tail part of the e-mail) !  However it has
been getting tedious maintaining the string list manually, because the 
Marketing and  Media etc. targets and addresses have been changing
far more frequently now.  They'll use them for a few days, then disappear
completely, and new ones will appear.  This type of spam is such an incredible
pain...  Is there some more general way that this sort of thing could be
handled?


-- 
View this message in context: 
http://www.nabble.com/please-help%2C-getting-hammered-with-snowshoe-spam-tp21627042p21628143.html



Re: please help, getting hammered with snowshoe spam

2009-01-23 Thread Dennis Hardy

 Can you repost that with full headers?

Yes, I have to wait for more to come through though as I have gotten into
the habit of just deleting the FNs.

 No DNSBL hits on the URI domain?

No, the domains change too quickly, so I almost never get DNSBL hits for
these.  I have DNSBL greylisting front-ending SA as well, and I get no hits
there either.  It is really annoying.  Usually someone will submit and
URIBL_BLACK will hit after a few though.  I've added a meta for the URL
check (below) and URIBL_BLACK and DCC_CHECK, maybe all I really need to do
is bump up the meta score for this combination?

 We'd need more than one sample URI to do a good job. Have you been
 collecting a corpus?

Not of a FN set.  I should collect this.

 I notice that this URI has a format that may be a good spam sign: the 
 domain name, followed by a long string of unpunctuated text gibberish.

Here is what I have been using (from previous help from this mail list!):

uri SSS_URI30 /\bhttp:\/\/[^\.\/]+\.(?i:com|net|info|biz)\/\w{30}\b/
score SSS_URI30 1.5

This uri rule does work very well, but they change the length sometimes, so
I have a few rules that handle different lengths.  Maybe I should use a range
such as {29,31} instead of just {30}, for example?
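
One way to see how a length range behaves before deploying it is to try the expression outside SpamAssassin. The sketch below uses Python's `re` module rather than Perl (an assumption made for convenience; the two engines treat the constructs used here the same way), widens the fixed `\w{30}` to `\w{29,31}`, and runs it against made-up URLs on example.com. The name `SSS_URI_RANGE` is hypothetical.

```python
import re

# Same expression as the SSS_URI30 rule in this message, with the fixed
# \w{30} widened to \w{29,31}. SSS_URI_RANGE is a hypothetical name.
SSS_URI_RANGE = re.compile(
    r"\bhttp://[^./]+\.(?i:com|net|info|biz)/\w{29,31}\b"
)


def hits(url):
    """Return True if the widened expression matches somewhere in url."""
    return SSS_URI_RANGE.search(url) is not None


# Made-up test URLs: a bare domain followed by n word characters.
for n in (28, 29, 30, 31, 32):
    print(n, hits("http://example.com/" + "a" * n))
```

Because of the trailing `\b`, a path of 32 or more word characters does not match: the quantifier cannot stop in the middle of a run of word characters. A single range therefore replaces several fixed-length rules only for the lengths it actually covers.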

Am I being too conservative?  Should I consider bumping the score of this up
more?  And my meta up more perhaps?


-- 
View this message in context: 
http://www.nabble.com/please-help%2C-getting-hammered-with-snowshoe-spam-tp21627042p21628431.html



Re: please help, getting hammered with snowshoe spam

2009-01-23 Thread Dennis Hardy

 Your BAYES is misfiring. The difference between BAYES_05 and BAYES_99 is 4.6,
 so you could have a score of 5.7 if you had a well-trained BAYES.

Yes, that would be great.  I will look at trying this.  I do get tens of
thousands of e-mails a day through this system though so it is hard to do
manual processes.  I need to play conservative and can't afford FPs at
all...


-- 
View this message in context: 
http://www.nabble.com/please-help%2C-getting-hammered-with-snowshoe-spam-tp21627042p21628480.html



Re: please help, getting hammered with snowshoe spam

2009-01-23 Thread Dennis Hardy

Everyone has given very helpful feedback!  At present it definitely sounds
like I should tweak my rules and train my bayes.  I will try taking steps
here and see how it goes.

Thank you all so very much!


-- 
View this message in context: 
http://www.nabble.com/please-help%2C-getting-hammered-with-snowshoe-spam-tp21627042p21631249.html



need help with spamassassin URI rule

2008-12-08 Thread Dennis Hardy

Hi, I was hoping someone on this list could help me with a custom rule for
SpamAssassin.  I'm not an expert at perl regexps by any means, and spent a lot
of time trying to come up with a working match, all to no avail...

What I would like to match on is URLs that do _not_ start with a third level
domain entry, and end with .com, .biz, .info, etc.  For example,
"http://hello.com/" (followed by more stuff) would match, and
"http://www.hello.com/{...}" would _not_ match.

Actually another way of looking at it is just matching on a single domain,
without any preceding ".", so basically "//domain.ext/" is what I want to
match for, and if there is a preceding "." in front of domain, that would
cause it to not match.  So "http://foo.bar.net/" would not match, but
"http://bar.net/" would.  Is this possible with perl regexps?

I've spent hours trying variations of different URI rules, but none of them
work (they always match the www. as well).  Here are some of my feeble
attempts:

[^w]{3}.*\.com\/
^(?:http?:\/\/)?[^\/]+(?!\/www)\.[^.]{7,}\.com\/
(?!www\.)   ...
[^\/]+(?!\/www)\.{1,}\.com\/

Some of the dot only checks I tried:

(?!\.)\w+?\.com
([^\.])\w+.*\.com\/

Again none of these work :-(

I really appreciate any help you could provide!

.dh


-- 
View this message in context: 
http://www.nabble.com/need-help-with-spamassassin-URI-rule-tp20897907p20897907.html



Re: need help with spamassassin URI rule

2008-12-08 Thread Dennis Hardy

 How about:
/:\/\/[^.\/]+\.[^\.\/]+\//

Hi John, sweet, this seems to work!  Could you help me with how to add a
list of com|net|info|biz|etc before the closing /, so it will match
against a list of known TLDs?
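
One possible shape for that change, sketched in Python's `re` module for quick testing (an assumption made for convenience; the same pattern text should carry over to a Perl regexp): the second `[^\.\/]+` is replaced by a case-insensitive alternation of TLDs. The TLD list and the names `SINGLE_DOMAIN` and `is_bare_domain` are only examples.

```python
import re

# The suggested expression with its second label replaced by an explicit,
# case-insensitive TLD alternation. The TLD list is only an example.
SINGLE_DOMAIN = re.compile(r"://[^./]+\.(?i:com|net|info|biz)/")


def is_bare_domain(url):
    """Return True when the host is exactly one label plus a listed TLD."""
    return SINGLE_DOMAIN.search(url) is not None


# URLs taken from the examples in this thread.
for url in (
    "http://hello.com/",
    "http://www.hello.com/",
    "http://foo.bar.net/",
    "http://bar.net/",
):
    print(url, is_bare_domain(url))
```

Since `[^./]+` cannot cross a dot, a host such as www.hello.com fails at the point where the TLD alternation is expected, which is what keeps third-level names from matching.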

Many thanks, you are awesome :-)

.dh


-- 
View this message in context: 
http://www.nabble.com/need-help-with-spamassassin-URI-rule-tp20897907p20899285.html



Re: Bounce back spam

2008-04-02 Thread Dennis Davis
On Thu, 27 Mar 2008, Jeff Koch wrote:

 From: Jeff Koch [EMAIL PROTECTED]
 To: users@spamassassin.apache.org
 Date: Thu, 27 Mar 2008 22:53:52 -0400
 Subject: Bounce back spam
 
 Our users are getting inundated with bounce-back, joe-job
 spam. We have the Vbounce.pm plugin enabled (v3.2.4) and have
 a 'whitelist_bounce_relays' with the name of the mailserver in
 the local.cf file and the 'failure notices', 'mail delay' and
 undeliverables don't seem to be getting any score at all.

For a non-SpamAssassin approach you might like to look at BATV:

http://en.wikipedia.org/wiki/Bounce_Address_Tag_Validation

http://tools.ietf.org/html/draft-levine-smtp-batv-00

http://mipassoc.org/batv/

BATV might interfere with some anti-spam measures, eg greylisting.
So you'd probably only want to turn it on for specific users
who are being badly affected.

Usual caveats apply:  I've no idea how difficult it would
be for you to install and I've never used it myself.
-- 
Dennis Davis, BUCS, University of Bath, Bath, BA2 7AY, UK
[EMAIL PROTECTED]   Phone: +44 1225 386101


SA-update error

2008-03-26 Thread Dennis Clark
Using SpamAssassin 3.1.8.  I haven't updated SA in about six months.  Ran 
sa-update -D using the default channel of updates.spamassassin.org and received 
the error "new version is 585884, skipped channel".

What exactly is going wrong here?  Has the sa-update default channel been 
changed?



Re: How To Kill Spam Dead?

2007-05-31 Thread Dennis Kavadas

guys, even though we use SA for tagging... the real short to long term
solution is TMDA
just my 2c worth



On 5/31/07, jdow [EMAIL PROTECTED] wrote:


From: John D. Hardin [EMAIL PROTECTED]

 On Wed, 30 May 2007, John D. Hardin wrote:

 Take a look at the spamassassin procmail ruleset at
 http://www.impsec/org/~jhardin/antispam/ for a starting point.

 Bah. That URL should, of course, be:

  http://www.impsec.org/~jhardin/antispam/

THAT said, the following link might be a good start, though it barely
scratches the surface. Robert Alan Soloway has been arrested for a host of spam
related offenses. Now, if they apply a gruesome enough punishment maybe
others will become a little less likely to spam.

Of course, we also need to go after his, and other spammer's, food chains
and nail some of those hides to the wall as well.

http://www.foxnews.com/story/0,2933,276573,00.html

{^_-}



Re: How To Kill Spam Dead?

2007-05-31 Thread Dennis Kavadas

most, if not all, spam has spoofed address headers that do not resolve to
a valid account on any host. that said, how is it a problem ?


On 5/31/07, Matt Kettler [EMAIL PROTECTED] wrote:


John Rudd wrote:
 Per Jessen wrote:
 Dennis Kavadas wrote:

 guys, even though we use SA for tagging... the real short to long term
 solution is TMDA

 I remember one of my friends saying just that - about 5 years ago.  It
 might be fine for personal email, but it's not very useful in a
 business context.  Too much end-user education required.

 That, and TMDA is a blight upon the internet.  It is at best
 misguided, and at worst irresponsible, to use challenge-response email
 systems.


Agreed. Challenge response systems attempt to solve the problem of spam
by forwarding it to someone else and hoping they'll use good judgment
for you and only approve mail they actually sent. You're turning your
spam problems into theirs.

The problem boils down to forged spam emails. If you're using TMDA and a
forged spam comes in, your TMDA system in-turn spams that victim of
forgery. After spamming them, you're hoping that they'll be nice and
delete the message for you, because you're too lazy to do it yourself.

My question is, why should I not activate the spam, after your TMDA
system has chosen to intrude on MY mailbox in an attempt to solve YOUR
spam problems?

Do I have any prior agreement with you to perform this task properly?
Are you paying me for my time? Oh, that's right, you're not paying me,
nor have you previously asked me if it's ok to do this to my mailbox, so
I'm free to do as I please..

Well then, who am I to stop you from getting advertisements you might
actually want?

*click*

Seriously, I take this approach to every TMDA challenge I get. I
encourage everyone to do the same. It is not your responsibility to
filter people's spam for them, so take the time and return the problem
back to its original owner.







Re: How To Kill Spam Dead?

2007-05-31 Thread Dennis Kavadas

if i had never met you before and if i asked you to knock on my door before
barging in, would you believe that was too much to ask of you ?




On 6/1/07, jdow [EMAIL PROTECTED] wrote:


From: Per Jessen [EMAIL PROTECTED]
Dennis Kavadas wrote:

 guys, even though we use SA for tagging... the real short to long term
 solution is TMDA

I remember one of my friends saying just that - about 5 years ago.  It
might be fine for personal email, but it's not very useful in a
business context.  Too much end-user education required.




TMDA involves challenge/response. I ***NEVER*** reply to spam.
A challenge, from a challenge response system is spam. Hence I
***NEVER*** reply to challenges. I have rerouted messages to idiots
who use it to tell them that their email host is broken and is very
unlikely to allow mail from me through. I suggest they get a real mail
service.

{^_^}



Re: How To Kill Spam Dead?

2007-05-31 Thread Dennis Kavadas

why ?



On 5/31/07, John Rudd [EMAIL PROTECTED] wrote:


Per Jessen wrote:
 Dennis Kavadas wrote:

 guys, even though we use SA for tagging... the real short to long term
 solution is TMDA

 I remember one of my friends saying just that - about 5 years ago.  It
 might be fine for personal email, but it's not very useful in a
 business context.  Too much end-user education required.

That, and TMDA is a blight upon the internet.  It is at best misguided,
and at worst irresponsible, to use challenge-response email systems.




Re: How To Kill Spam Dead?

2007-05-31 Thread Dennis Kavadas

why isn't it useful in a business context ?
the sender gets a challenge once ! ...how is that a problem ?



On 5/31/07, Per Jessen [EMAIL PROTECTED] wrote:


Dennis Kavadas wrote:

 guys, even though we use SA for tagging... the real short to long term
 solution is TMDA

I remember one of my friends saying just that - about 5 years ago.  It
might be fine for personal email, but it's not very useful in a
business context.  Too much end-user education required.


/Per Jessen, Zürich




Re: How To Kill Spam Dead?

2007-05-31 Thread Dennis Kavadas

i think we all need to read the TMDA FAQ ! :-)




On 6/1/07, Rick Macdougall [EMAIL PROTECTED] wrote:


jdow wrote:
 From: Rick Macdougall [EMAIL PROTECTED]

 Dennis Kavadas wrote:
 if i had never met you before and if i asked you to knock on my
 door before barging in, would you believe that was too much to ask of
 you ?

 If you are a business or someone looking for help, you either have an
 open door policy or you asked for someone to help you out.

 Asking them to knock first is just rude and, in the case of
 businesses, standing in the way of doing business, since your clients
 can not easily get a hold of you.

 Actually the situation is the reverse of the stranger at the door
 situation.
 THEY are the stranger to whom I am replying. I've not hit a corporation
 stupid enough to turn me away with a C/R.

 All the C/R's I have experienced are from ME answering THEIR email. That
 in NO WAY matches the stranger at the door. HE is the stranger at the
 door
 not me. Most of the C/Rs have been to messages on mailing lists. That is
 as utterly unfriendly as you can get. And, again, HE is the stranger
 at the door
 I was trying to help.

 That level of rudeness does not set well with me. Call me a crotchety old
 bitch if you want. But I will continue to reject C/R, often with
 extreme prejudice, into the foreseeable future.

 {^_^}
Heh, I think I love you :)

Rick




Re: Does anyone catch this....

2007-05-14 Thread Dennis Davis
On Mon, 14 May 2007, Duncan Hill wrote:

 From: Duncan Hill [EMAIL PROTECTED]
 To: users@spamassassin.apache.org
 Date: Mon, 14 May 2007 11:41:24 +0100 (BST)
 Subject: Re: Does anyone catch this
 
 On Mon, May 14, 2007 11:32, Matt Hampton wrote:
  http://www.coders.co.uk/slipped.through.txt
 
 
  It has sailed through both a SA3.1.8 and SA3.2.0 (3.2.0-pre2-r512851)
  running on recent versions of MailScanner
 
 The ClamAV engine tends to work well on a large number of that
 type of phish.  Local testing shows DCC hitting it, but that's
 about it.  Doesn't help that Halifax don't publish SPF records.

In particular the Sanesecurity additions to ClamAV detect this as:

Html.Phishing.Bank.Sanesecurity.06030604

We've detected (and rejected) over 1300 copies of this particular
phishing scam over the last couple of weeks or so.
-- 
Dennis Davis, BUCS, University of Bath, Bath, BA2 7AY, UK
[EMAIL PROTECTED]   Phone: +44 1225 386101


RE: Does anyone catch this....

2007-05-14 Thread Dennis Davis
On Mon, 14 May 2007, Rick Cooper wrote:

 From: Rick Cooper [EMAIL PROTECTED]
 To: 'SpamAssassin' users@spamassassin.apache.org
 Date: Mon, 14 May 2007 09:04:57 -0400
 Subject: RE: Does anyone catch this

...

 I just sent Steve an updated script that accommodates the trailing
 back slash the debian adds to the clam db dir in the debug output
 and add -m 1 to the grep so it short circuits finding the clam
 db dir (so it now takes less than a second), and I added rsync
 for the MSRBL-* files since that site not only supports it but
 prefers it be handled that way. I would imagine Steve will have it
 up sometime today, I have been testing it since he made the last
 change to the mirroring methods last week.

[Posted to both the [EMAIL PROTECTED] and
 users@spamassassin.apache.org mailing lists.  Please followup
 appropriately.]

Steve tells me he has just updated the download script on the main
site (www.sanesecurity.com).  Blog additions are coming, but might
not make it until tomorrow.
-- 
Dennis Davis, BUCS, University of Bath, Bath, BA2 7AY, UK
[EMAIL PROTECTED]   Phone: +44 1225 386101


Can't locate object method 'new' via package IO::Zlib

2007-04-25 Thread Dennis Clark
I keep getting this error - Can't locate object method 'new' via package 
IO::Zlib at /usr/bin/sa-update line 671 - when attempting to run sa-update.  
It worked fine when I ran it about 10 months ago (I'm way behind).
 
Using SA version 3.1.3 on Fedora.


Re: Blocking mail from one specific user to another

2007-03-23 Thread Dennis Davis
On Fri, 23 Mar 2007, Michael Connors wrote:

 Received: from [87.198.136.186] (helo=[10.1.1.125])
 by mail.go2.ie with esmtpa (Exim 4.52)
 id 1HUjCF-0005Fo-62; Fri, 23 Mar 2007 12:48:43 +
 Message-ID: [EMAIL PROTECTED]
 Date: Fri, 23 Mar 2007 12:48:44 +
 From: Michael Connors [EMAIL PROTECTED]
 To: Loren Wilton [EMAIL PROTECTED]
 CC:  users@spamassassin.apache.org
 Subject: Re: Blocking mail from one specific user to another
 
 I see, I didn't understand the syntax of the rules before, now I
 understand.  Thank you, I will try that.

As indicated elsewhere in this thread, this is best done by the MTA
and not SpamAssassin.

You appear to be using exim as your MTA.  At least that's what's
indicated by:

 Received: from [87.198.136.186] (helo=[10.1.1.125])
 by mail.go2.ie with esmtpa (Exim 4.52)
 id 1HUjCF-0005Fo-62; Fri, 23 Mar 2007 12:48:43 +

So have a look at exim's wiki.  This specific case is covered in:

http://www.exim.org/eximwiki/FAQ/Policy_controls/Q0710
-- 
Dennis Davis, BUCS, University of Bath, Bath, BA2 7AY, UK
[EMAIL PROTECTED]   Phone: +44 1225 386101


Getting strange messages, bayes subvert attempts?

2007-02-21 Thread Dennis Krøger
Hi, I've been getting quite a few strange messages in my inbox lately,
they look like this:  (I'm describing them instead of posting them in
full, because a lot of people have probably already trained them as spam)

Starts with a hi and a call me (always exactly the same), next line
is random, next line talks about how poor I am for getting so much spam
(again same each time), then another random line, then this hex line,
always the same as well (I've put in spaces and the word next, again
for this message not to be flagged as spam, but in the mail, it's one
continuous line):

6D71 next 7479 next 6A6E next 6A6D next 3768 next 696A next 716E next
7273 next 6845 next 7538 next 3370

The message in itself is not an attempt to spam (as far as I can see; I can't
find anything they want to sell us, at least :)), but the pattern is VERY
strange. Why mass mail this, if not to try and confuse filters, or
something like that? It's probably nothing; I just want to make sure that we
know about this, just in case the bastards found a hole.
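
One cheap check on a line like this is to see whether it decodes to anything readable. Below is a minimal sketch in Python, assuming the groups are plain hex-encoded ASCII bytes; nothing in the messages confirms that, and the function name `decode_groups` is made up for illustration.

```python
# The eleven hex groups from the message, with the "next" separators removed.
GROUPS = "6D71 7479 6A6E 6A6D 3768 696A 716E 7273 6845 7538 3370".split()


def decode_groups(groups):
    """Join the hex groups and decode the resulting bytes as ASCII."""
    return bytes.fromhex("".join(groups)).decode("ascii")


decoded = decode_groups(GROUPS)
print(decoded)
print(len(decoded), "characters")
```

Under that assumption every byte falls in the printable ASCII range, and the result is a 22-character mix of letters and digits with no recognizable words in it.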

Regards,
Dennis Du Krøger




Re: Getting strange messages, bayes subvert attempts?

2007-02-21 Thread Dennis Krøger

Doh, it's easier with some examples, didn't think of posting a link
until I saw another do it in the archives. (sorry for being a newbie :s)

http://www.hp23c.dk/~d/strangespam/

Notice how 3 of the lines stays exactly the same, while 2 are random.

Regards,
Dennis





Re: phone number spam

2006-12-15 Thread Dennis Davis
On Fri, 15 Dec 2006, Rajkumar S wrote:

 From: Rajkumar S [EMAIL PROTECTED]
 To: users@spamassassin.apache.org
 Date: Fri, 15 Dec 2006 14:51:06 +0530
 Subject: phone number spam
 
 For the last 2 days I have been getting lots of spam with phone numbers rather
 than a website or email address to contact the spammer. Unfortunately the
 only rules that are matching for these spam are the rbl ones. Anyone
 else seeing this type of new spam? Anyone with some rule ideas?
 
 A sample is at http://pastebin.ca/279774

The sample indicates you're running SpamAssassin-3.0.3.  That's
*old*.  Seriously consider upgrading to SpamAssassin-3.1.7.  Then
run sa-update.  Install the Botnet plugin.  That should score on the
sample you've given.  Also look at installing selected rules from
the SpamAssassin Rules Emporium if you aren't already doing so.
-- 
Dennis Davis, BUCS, University of Bath, Bath, BA2 7AY, UK
[EMAIL PROTECTED]   Phone: +44 1225 386101


Re: new Botnet plugin version soon

2006-11-30 Thread Dennis Davis
On Thu, 30 Nov 2006, John Rudd wrote:

 From: John Rudd [EMAIL PROTECTED]
 To: users@spamassassin.apache.org,
 CommuniGate Pro Discussions [EMAIL PROTECTED],
 MailScanner discussion [EMAIL PROTECTED]
 Date: Thu, 30 Nov 2006 04:06:55 -0800
 Subject: new Botnet plugin version soon

...

 Question 2: someone asked why my module is Botnet instead of
 Mail::SpamAssassin::Plugin::Botnet.  The answer is: when I
 first started this (and this is/was my first SA Plugin authoring
 attempt), I tried that and it didn't work.  If someone wants to
 look at it, and figure out how to make that work

I prefer to have all the SpamAssassin plugins grouped together where
the default install puts them.  This is in the directory:

/usr/local/libdata/perl5/site_perl/Mail/SpamAssassin/Plugin/

on my OpenBSD boxes.

So I altered Botnet.pm so the line:

package Botnet;

now reads:

package Mail::SpamAssassin::Plugin::Botnet;

and placed it in the above directory.

The line:

loadplugin  Botnet  Botnet.pm

in /etc/mail/spamassassin/Botnet.cf was altered to:

loadplugin Mail::SpamAssassin::Plugin::Botnet

It works a treat.

I did something similar for the FuzzyOcr.pm plugin.

 (but still have the files located in /etc/mail/spamassassin) I
 would happily incorporate it.

Well, you *could* do this with soft links.  But that would be
a terrible hack :-(
-- 
Dennis Davis, BUCS, University of Bath, Bath, BA2 7AY, UK
[EMAIL PROTECTED]   Phone: +44 1225 386101


Re: Loads of 'xxx wrote:' Spam

2006-11-28 Thread Dennis Davis
On Mon, 27 Nov 2006, Theo Van Dinter wrote:

 From: Theo Van Dinter [EMAIL PROTECTED]
 To: users@spamassassin.apache.org
 Date: Mon, 27 Nov 2006 16:32:50 -0500
 Subject: Re: Loads of 'xxx wrote:' Spam

...

  Has anyone else seen this?  Is there a rule I can use to block
  this?  The names change ALL the time, so it would have to be
  something dynamic.
 
  Does anyone have something I could use?

 As has been the suggestion for the past X months, run sa-update. :)

Yup, works for me.

Note that the Botnet plugin (subject of another thread on this list)
may help with hosts that slip past any RBLs you use.  Here's the
results for one of these I recently received in my spam folder:


X-Spam-Checker-Version: SpamAssassin 3.1.7 (2006-10-05) on merckx.bath.ac.uk
X-Spam-Level: 
X-Spam-Status: Yes, score=8.9 required=6.0 tests=BOTNET,BOTNET_CLIENT,
BOTNET_IPINHOSTNAME,RCVD_FORGED_WROTE,SARE_LWSHORTT,SARE_MLB_Stock2,
SARE_PROLOSTOCK_SYM1 autolearn=disabled version=3.1.7
X-Spam-Report: 
*  2.8 RCVD_FORGED_WROTE Forged 'Received' header found ('wrote:' spam)
*  0.0 BOTNET_IPINHOSTNAME Hostname contains its own IP address
*  1.7 SARE_MLB_Stock2 BODY: SARE_MLB_Stock2
*  0.8 SARE_LWSHORTT BODY: SARE_LWSHORTT
*  1.7 SARE_PROLOSTOCK_SYM1 BODY: Last week's hot stock scam
*  2.0 BOTNET_CLIENT Hostname looks like a client hostname
*  0.0 BOTNET Any Botnet rule hit
Received: from 89-139-185-37.bb.netvision.net.il ([89.139.185.37] helo=mafioso)

(I've tweaked the BOTNET rules.  It would score more with a standard
 configuration.)
-- 
Dennis Davis, BUCS, University of Bath, Bath, BA2 7AY, UK
[EMAIL PROTECTED]   Phone: +44 1225 386101

