I had a spam slip through (only one so far this week). It was strictly
HTML, with graphics in a table (the only text was "**"), a Bayes rating of
50%, and an initial SA score of 5.8 against a required 9.

In addition to running sa-learn on it, I created several rules to keep
similar spam from getting through. I'm going to present my current
philosophy on these, and would be very interested in hearing how others
feel about these types of rules and/or scores.

uri       L_u_time4more  /time4more\.net/i
describe  L_u_time4more  Body text references known spammer
score     L_u_time4more  9.00  # graphics-only spam Aug 4 03

There were five graphics in the HTML. Each was linked to
www.time4more.net/link/something -- To me this is proof absolute that
time4more.net is the source of the spam, and it earns my top spam rating
(equal to my required hits).

Note that I did scan my corpus anyway, and the rule hits a total of 7
spam, dated July 24 through today, and no non-spam. This just happened to
be the first from that source that snuck through SA's rules.
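
For anyone who wants to repeat that kind of corpus check outside of SA,
here is a minimal sketch in Python. It assumes the corpus has already been
loaded as (label, text) pairs; the function name count_hits and the two
sample messages are made up for illustration, and Python's regex engine
stands in for SA's own rule matching.

```python
import re
from collections import Counter

def count_hits(pattern, corpus):
    """Count how many messages per label match the pattern.

    corpus is an iterable of (label, text) pairs, e.g. ("spam", "...").
    """
    rx = re.compile(pattern, re.IGNORECASE)
    hits = Counter()
    for label, text in corpus:
        if rx.search(text):
            hits[label] += 1
    return hits

# Hypothetical miniature corpus, invented for illustration only.
sample = [
    ("spam", '<a href="http://www.time4more.net/link/abc"><img src="x.gif"></a>'),
    ("ham", "Lunch on Friday? See http://example.com/menu"),
]

hits = count_hits(r"time4more\.net", sample)
print(hits["spam"], "spam,", hits["ham"], "ham")
```

A rule that shows spam hits and zero ham hits over a reasonably large
corpus is the kind I feel safe scoring high.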

header    L_s_CorelWPOffice  Subject =~ /(?:Corel|WordPerfect).{1,15}Office/i
describe  L_s_CorelWPOffice  Subject apparently mentions software for sale
score     L_s_CorelWPOffice  0.4     # 2 spam, 0 ham, as of Aug 4, 2003

The subject heading for this spam was:
> LAST CHANCE--Get 80% off Corel Wordperfect Office 11

I would expect to receive a fair amount of ham which references Corel, or
WordPerfect, and maybe even WP Office, so this is a low-scoring rule, but
with three full months of corpus, this rule matched only this spam and
one other. A scan for just WordPerfect matched the same two spam. A scan
for Corel by itself matched these two plus one ham. A match on Office by
itself matched a bunch of ham.

So a subject which references WP Office seems to be a valid indicator of
spam, but with such a small sample (only two hits), I keep the score
at/under 0.5 (approx 5% of my required hits).
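
As a quick sanity check on the pattern itself, the sketch below runs the
same regular expression over a few subjects in Python. Only the first
subject comes from the actual spam; the others are invented to show what
the pattern does and does not catch. Python's re module is assumed to
behave the same as Perl for this simple pattern.

```python
import re

# Same pattern as the L_s_CorelWPOffice rule, case-insensitive.
corel_office = re.compile(r"(?:Corel|WordPerfect).{1,15}Office", re.IGNORECASE)

subjects = [
    "LAST CHANCE--Get 80% off Corel Wordperfect Office 11",  # from the spam
    "WordPerfect Office upgrade question",                   # invented
    "Corel quarterly results",                               # invented
    "Office party on Friday",                                # invented
]

matches = [bool(corel_office.search(s)) for s in subjects]
for subject, matched in zip(subjects, matches):
    print("HIT " if matched else "miss", subject)
```

Neither "Corel" nor "Office" alone triggers the rule; the pattern needs one
of the product names followed within 15 characters by "Office".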

header    L_s_LastChance  Subject =~ /LAST\ CHANCE/i
describe  L_s_LastChance  Subject claims it is the last chance for something
score     L_s_LastChance  0.1     # more ham than spam as of Aug 4, 2003

Scanning for "last chance" in the subject header netted 9 hits, 2 of
which were spam. Philosophy: It's reasonable to expect that an email with
this subject heading could be spam. It's reasonable to expect that a
company we (my recipients) have relationships with could advertise a
valid last chance offer. It's possible though rare for personal email to
mention a last chance. Therefore, though I think it's worth having a rule
for this, I most definitely want to keep the score at a minimum.
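
To put the three rules so far side by side, this small sketch computes the
share of each rule's corpus hits that were spam, using the counts quoted
above. The helper name spam_fraction is made up, and this is only a way of
looking at the numbers, not how SA assigns scores.

```python
def spam_fraction(spam_hits, ham_hits):
    """Share of a rule's hits that were spam, or None if it never hit."""
    total = spam_hits + ham_hits
    if total == 0:
        return None
    return spam_hits / total

# (spam hits, ham hits) as reported in this post.
corpus_counts = {
    "L_u_time4more": (7, 0),
    "L_s_CorelWPOffice": (2, 0),
    "L_s_LastChance": (2, 7),
}

for name, (spam, ham) in corpus_counts.items():
    print(f"{name:20s} spam={spam} ham={ham} "
          f"fraction={spam_fraction(spam, ham):.2f}")
```

The fraction alone says nothing about sample size, which is why the two-hit
Corel rule stays low even though every hit so far was spam.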

header    L_hr_lattelekom  Received =~ /lattelekom\.net/
describe  L_hr_lattelekom  Spam passed through lattelekom.net relay
score     L_hr_lattelekom  0.1         # 1 spam, Aug 4, 2003

The spam passed through mx.lattelekom.net just before reaching my server.
This appears to be an ISP of some kind in Latvia, if I read things
correctly. Scanning for lattelekom.net, this spam is the only hit.

I therefore theorize that lattelekom.net has an open relay, and the
spammer is utilizing it. I've created the rule on this theory, but with a
sample of one, I've given it only a minimal score. I can increase the
score if it appears in other spam that sneaks through. (And if the 0.1
pushes other relayed spam over my required hits, then I won't need to
increase it.)
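
The arithmetic behind that reasoning, as a rough sketch: it adds the new
rule scores to this spam's original 5.8 and compares against the required
9. It assumes every other hit stays the same, which will not be exactly
true once sa-learn has changed the Bayes rating.

```python
REQUIRED_HITS = 9.0   # my required hits
INITIAL_SCORE = 5.8   # what this spam scored before the new rules

# Scores of the four new rules, as listed in this post.
new_rules = {
    "L_u_time4more": 9.0,
    "L_s_CorelWPOffice": 0.4,
    "L_s_LastChance": 0.1,
    "L_hr_lattelekom": 0.1,
}

def total_score(base, rule_scores):
    """Base score plus the sum of the given rule scores."""
    return base + sum(rule_scores)

with_all = total_score(INITIAL_SCORE, new_rules.values())
low_only = total_score(
    INITIAL_SCORE,
    (s for name, s in new_rules.items() if name != "L_u_time4more"),
)

print(f"all four rules:   {with_all:.1f} (caught: {with_all >= REQUIRED_HITS})")
print(f"low-scoring only: {low_only:.1f} (caught: {low_only >= REQUIRED_HITS})")
```

So the three low-scoring rules together would not have caught this one;
only the time4more rule does that on its own.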

Which of my actions would you agree with, and which would you disagree
with?  And more importantly, why?

Thanks.

Bob Menschel




_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
