> Sandy, I think you're missing the point.

I don't think so.

A Subject with a pair of [square brackets] in it is, on its own and
practically speaking, a 0% indicator of spam, unless your mail traffic
is separately insured against such incidents (which I doubt it is). As
such, it should comprise 0% of your spam weight.

In  the  same fashion, just having a Subject that ends with a question
mark  is  probably  a .0001% indicator of spam, so you cannot reliably
assign more than .0001% ~= 0% of your spam weight to it.

What you need to make most of your "single character filter" effective
is  a  threshold  above  which a fixed or computed aggregate weight is
assigned,  and  below  which  NO  weight is assigned. Declude does not
presently  allow  for this, though it's been in the suggestion box for
some  time.  I'm not saying that there aren't *some* single characters
that  deserve  some weight--note that I only quoted about half of your
message in mine.
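
To make the threshold idea concrete, here is a minimal Python sketch.
It is not Declude syntax (Declude cannot do this today, as noted), and
the character scores, the threshold of 10, and the function names are
all made up for illustration. I also assume "above" means "at or
above".

```python
# Hypothetical raw scores per character; these are NOT real Declude values.
RAW_SCORES = {"[": 2, "]": 2, "?": 2, "!": 3, "$": 3}


def raw_score(subject):
    """Sum the raw score of every listed character present in the subject."""
    return sum(score for ch, score in RAW_SCORES.items() if ch in subject)


def thresholded_weight(subject, threshold=10, fixed_weight=None):
    """Assign NO weight below the threshold.

    At or above it, assign either a fixed weight (if one is given) or
    the computed aggregate of the raw scores.
    """
    total = raw_score(subject)
    if total < threshold:
        return 0
    return fixed_weight if fixed_weight is not None else total
```

With these assumed numbers, a list subject such as
"[Declude.JunkMail] Re: filter weights" has a raw score of 4 and gets
no weight at all, while a subject that trips several characters at once
crosses the threshold and gets the aggregate.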

> Even  the Declude JunkMail list now gets 4 points for having [ and ]
> in the subject, but it still comes through fine.

And how many spam messages are you really catching with that part of
the filter? What happens when one of your users subscribes to a
listserv that has a few other marks against it (REVDNS, et al.), and
then those four points push it over the top, while having no real
effect in the other direction?
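
The arithmetic behind that worry, as a small Python sketch. Only the
four points for brackets come from this thread; the hold threshold of
10, the REVDNS weight, and the second test name are assumptions chosen
for illustration.

```python
HOLD_AT = 10  # hypothetical total weight at which a message is held


def total_weight(failed_tests):
    """Add up the weights of all tests a message failed."""
    return sum(failed_tests.values())


def is_held(failed_tests, hold_at=HOLD_AT):
    """A message is held once its total weight reaches the threshold."""
    return total_weight(failed_tests) >= hold_at


# A legitimate listserv message with a couple of marks against it
# (weights are assumed, not taken from any real configuration).
listserv_mail = {"REVDNS": 4, "OTHER_MINOR_TEST": 3}

# The same message once the bracket test adds its four points.
with_brackets = dict(listserv_mail, SUBJECT_BRACKETS=4)
```

Here `is_held(listserv_mail)` is false at 7 points, and
`is_held(with_brackets)` is true at 11: the bracket test alone decides
the outcome for a legitimate message.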

> If we try to prevent all false positives we get a lot of junk in our
> inboxes.

I do not approve of any false positives that prevent message delivery.
Individual components in a weighted system like Declude's that have
high sensitivity are not false positives in their own right. But when
components fire on legitimate mail out of proportion to the spam they
catch, they will inevitably lead to FPs in final weighting.

> I  don't  need this filter to have most list emails held for review.

*Most*  list e-mails? We don't have this issue, and if we did, none of
our  clients  would retain us. It's unacceptable for us to assume that
mass  dispatches are not business-critical simply because they are not
person-to-person. Your site's mileage, I note again, may vary; I guess
you're  right  that if you're already HOLDing list e-mails as a normal
thing, holding with a higher weight won't mean much.

> My  weights  are  low  enough  to  make little or no difference with
> legitimate emails.

Doesn't look that way to me. We assign weight only to tests whose hits
have a statistically significant chance of being spam. While some of
your single-character filters are just fine, the ones I commented on
do not meet that bar, in my estimation.
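
One rough way to check a test before giving it weight is to measure,
on a hand-labeled sample, what fraction of the messages it hits are
actually spam. The Python sketch below uses hypothetical function
names and a tiny made-up sample; a real check would need far more
messages before calling any result statistically significant.

```python
def spam_precision(messages, test):
    """Fraction of messages hit by `test` that are labeled spam.

    `messages` is a list of (subject, is_spam) pairs.
    Returns None when the test hits nothing.
    """
    hits = [is_spam for subject, is_spam in messages if test(subject)]
    if not hits:
        return None
    return sum(hits) / len(hits)


def has_brackets(subject):
    """Single-character style test: a pair of square brackets in the subject."""
    return "[" in subject and "]" in subject


# Made-up, hand-labeled sample: (subject, is_spam).
sample = [
    ("[Declude.JunkMail] Re: filter weights", False),
    ("[announce] Maintenance window tonight", False),
    ("Lunch on Friday?", False),
    ("What is G.E.N.ERIC VI.A.G.R.A?", True),
    ("[offer] 75% D1SC0UNT!!", True),
]
```

On this sample, `spam_precision(sample, has_brackets)` comes out to one
in three, which is lower than the share of spam in the sample as a
whole, so the bracket test would earn no weight by this measure.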

> It's these that will get held:

> What is G.E.N.ERIC VI.A.G.R.A?   (16 points)
> [EMAIL PROTECTED] -->> 75% D1SC0UNT!!     nisbabct jvvjhgyxmk   (22 points)

Well, yes, if you're using your single-character obfuscation test as
your ONLY Declude test, that's true. But I was not assuming that you'd
chosen such a limited implementation. I state again that what most
sites need--maybe not yours, by a stroke of luck--is an all-in-one
test with intelligently computed weight, such as SPAMCHK or SNIFFER,
rather than an attempt to make Declude's FILTER test more sensitive
than it's designed to be.

-Sandy


------------------------------------
Sanford Whiteman, Chief Technologist
Broadleaf Systems, a division of
Cypress Integrated Systems, Inc.
e-mail: [EMAIL PROTECTED]
------------------------------------

---
This E-mail came from the Declude.JunkMail mailing list.  To
unsubscribe, just send an E-mail to [EMAIL PROTECTED], and
type "unsubscribe Declude.JunkMail".  The archives can be found
at http://www.mail-archive.com.