I took the last 10,000 non-mailing-list E-mails sent to us, and 2.5% would be blocked by your regex. That's selecting *our* data, not selecting data to fit my position. If you have better data, please share it.
The false positive problem is this, where "positive" means "reject":
positives = true positives + false positives
Your "data" (non-rejected messages) is not even part of the equation. Your data is the "negatives", ie, the non-rejects.
I consider false positives to be a percentage of total rejects, not a percentage of total legit mail. I don't care about legit mail that wasn't rejected. It's not the problem.
That's not how anti-spam companies calculate FPs. If you have 9,990 spams and 10 legitimate E-mails, and you block all 10,000 E-mails, you don't have a .1% FP ratio. You have a 100% FP ratio.
Come on, that's BS and you know it.
I block 10,000 messages as positive rejects (eg, using the "DSL subscriber PTR hostname" filter). Of those, 10 are legit, ie, "false positives".
Rate of FP in total P is 10/10,000 = 0.001 = 0.1%
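For anyone following along, here is a rough sketch of the two ways of counting that are being argued over in this thread. The function names are mine, not anything standard, and the numbers are the ones already quoted above.

```python
def share_of_rejects(false_rejects, total_rejects):
    """Legit messages as a fraction of everything that was rejected."""
    return false_rejects / total_rejects


def share_of_legit(false_rejects, total_legit):
    """Rejected legit messages as a fraction of all legit mail."""
    return false_rejects / total_legit


# 10,000 rejects, 10 of them legit, and (as in the quoted example)
# those 10 are the only legit messages there were.
print(f"{share_of_rejects(10, 10_000):.1%}")  # 0.1%
print(f"{share_of_legit(10, 10):.1%}")        # 100.0%
```

Same 10 messages, same 10,000 rejects; the only thing that changes is the denominator.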
An amplifying factor for the "subscriber PTR hostname" filter is that out of 1000 IPs, say 10 are legit mailers, and 990 are spammers.
That looks like a "1% problem" but it's not, when you consider that a legit mailer will send out a few dozen or a few 100 msgs per day to a range of MXs (ie, 1 or 2 msgs/day to a given MX), while the spammer will send 100s of msgs PER HOUR to just your MX, and the world's DSL/cable subscribers will inundate your MX with 1000s of msgs.
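To put rough numbers on that, here is a back-of-the-envelope sketch. The IP counts are the ones above; the per-IP message rates are assumptions I picked from the ranges just mentioned (2 msgs/day for a legit mailer, 100 msgs/hour for a spammer), so treat the result as an order of magnitude only.

```python
# Figures from the paragraph above.
LEGIT_IPS = 10
SPAM_IPS = 990

# Assumed rates, picked from the ranges mentioned above (illustrative only).
LEGIT_MSGS_PER_DAY = 2        # "1 or 2 msgs/day to a given MX"
SPAM_MSGS_PER_DAY = 100 * 24  # "100s of msgs PER HOUR", taken as 100/hour


def legit_share_by_ip():
    """Fraction of subscriber-range IPs that are legit mailers."""
    return LEGIT_IPS / (LEGIT_IPS + SPAM_IPS)


def legit_share_by_message():
    """Fraction of messages from those IPs, at your MX, that are legit."""
    legit = LEGIT_IPS * LEGIT_MSGS_PER_DAY
    spam = SPAM_IPS * SPAM_MSGS_PER_DAY
    return legit / (legit + spam)


print(f"by IP:      {legit_share_by_ip():.1%}")
print(f"by message: {legit_share_by_message():.5%}")
```

Counted by IP it is 1%; counted by message, with these assumed rates, it comes out well under a thousandth of a percent.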
The "percentage-of-legitimate-mail-caught-to-the-total-amount-of-mail-including-spam" ratio can be useful, but it isn't the FP ratio.
Then why bring it up? I didn't. I don't consider non-rejected messages (your "data") as part of the "false positive" discussion.
I'm not saying that using the reverse DNS entry filtering in spam control is wrong (actually, it's great). I'm saying that [1] People should only *block* E-mail based on that *if* they are willing to accept a lot of false positives
A "lot" is subjective, and it's always a judgement call.
and [2] Everyone else should *only* use the reverse DNS entry filtering *if* it is used as part of an overall spam system (IE only blocking E-mail as a result of that *and* other failures).
Compound, weighted conditions are marginally more accurate, but they have the requirement that all spam must be completely received and scanned, and a lot of mail admins refuse, or can no longer afford, that operational wear-and-tear and bandwidth waste.
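A minimal sketch of the trade-off, with everything in it hypothetical: the pattern is NOT the regex discussed in this thread, and the weights, threshold, and body test are made up for illustration. The point is only that the single test needs nothing but the connecting host's PTR name, while the weighted version also needs the message body.

```python
import re

# Hypothetical "subscriber-looking" PTR pattern, for illustration only.
SUBSCRIBER_PTR = re.compile(r"(dsl|cable|dial|dhcp|pool)", re.IGNORECASE)


def reject_at_connect(ptr_hostname):
    """Single test: decidable before any message data is received."""
    return bool(SUBSCRIBER_PTR.search(ptr_hostname))


def weighted_reject(ptr_hostname, body, threshold=5.0):
    """Compound test: needs the whole message before it can decide."""
    score = 0.0
    if SUBSCRIBER_PTR.search(ptr_hostname):
        score += 3.0  # made-up weight for the PTR test
    if "act now" in body.lower():
        score += 3.0  # made-up weight for a stand-in content test
    return score >= threshold


print(reject_at_connect("adsl-12-34.example.net"))
print(weighted_reject("adsl-12-34.example.net", "Meeting notes attached"))
```

With these made-up weights, neither test alone reaches the threshold, so the PTR match by itself no longer rejects; the cost is that the body has to be received and scanned first.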
Len
_____________________________________________________________________ http://MenAndMice.com/DNS-training: New York; Seattle; Chicago IMGate.MEIway.com: anti-spam gateway, effective on 1000's of sites, free
