On 4/30/2010 9:32 PM, Andy Schmidt wrote:
<snip/>
But your documentation of the reputation system has a graph that shows that there is yet another category: "WHITE".
I don't know the details of Declude's implementation. Presumably they could (or maybe even do) implement WHITE.
The SNFIPREP test does offer the ability to define at what decimal value (between -1 and +1, in .1 increments) a weight can be subtracted. But the question is - is that a SENSIBLE use of your reputation database? For example, could -0.8 be a sensible threshold to give an email "credit" for coming from a reputable IP source?
I'm guessing on how that test is implemented, but if I've guessed correctly then -0.8 would certainly be a good WHITE set point.
My guess is based on using a combined score value from the IP reputation that combines the confidence figure and the probability figure. In that case only a strongly negative p coupled with a strong c would result in a -0.8.
Or is it better to let the "good" reputation be considered AFTER the content scan and then use the "combined" exit code?
As I understand it, Declude uses a weighting system. Except for some short-circuit abilities, that means all tests are run, their scores are added together, and then the total is used to determine the disposition of the message. I don't think there is an 'AFTER' in this case.
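To make the "add everything up, then decide" idea concrete, here is a rough sketch. It is not Declude's code: the test names, the weights, the hold/delete thresholds and the action names are all made up for illustration.

```python
# Hypothetical illustration of a weighting system: every test contributes
# a weight, the weights are summed, and only the total decides what
# happens to the message. None of these names or numbers come from Declude.

def total_weight(test_weights):
    """Sum the weights contributed by all tests that ran."""
    return sum(test_weights.values())


def disposition(total, hold_at=10.0, delete_at=20.0):
    """Map the summed weight to an action (thresholds are assumptions)."""
    if total >= delete_at:
        return "delete"
    if total >= hold_at:
        return "hold"
    return "deliver"


# A content test adds weight, a good IP reputation subtracts some.
weights = {"CONTENT": 8.0, "DNSBL": 3.0, "IPREP": -4.0}
print(total_weight(weights), disposition(total_weight(weights)))
```

Because the order of the tests does not change the sum, a "credit" from a good reputation has the same effect whether you think of it as applied before or after the content scan.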
The IP reputation test is useful in cases where a message might be too new to hit a pattern match and where the IP reputation is not quite strong enough to be in one of the GBUdb envelopes. In such a case it might be useful to combine the 'analog' reputation score with the scores from other tests to push the message over the fence one way or another... at least that's how the test was designed to work in the API we provide.
It sounds like you're describing the IP Reputation test as having thresholds. That's an interesting way to do it (I haven't looked to see if it is actually that way)... a better way to do it would be to scale the result. Let's pick a factor of 5: as the API result goes from 0 to -1, the "negative" weight would go linearly from 0 to -5, and similarly a positive-going reputation would scale linearly from 0 to +5 as the API result goes from 0 to +1.
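As a sketch of the linear scaling I have in mind (the function name and the clamping step are my own additions, and 5 is just the factor picked above):

```python
# Linear scaling of an API reputation result in [-1, +1] to a weight in
# [-factor, +factor]. The function name and the clamping are assumptions
# made for this sketch; 5 is only the example factor.

def reputation_weight(api_result, factor=5.0):
    """Scale the reputation result linearly to a message weight."""
    # Guard against out-of-range input before scaling.
    clamped = max(-1.0, min(1.0, api_result))
    return clamped * factor


for result in (-1.0, -0.8, 0.0, 0.3, 1.0):
    print(result, reputation_weight(result))
```

With this approach there is no single set point: a result of -0.8 simply earns most of the available credit, and a weak result earns very little either way.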
The API result treats 0 as meaning "I don't know", either because the confidence figure (c) is 0 or because the probability figure (p) is 0 (meaning a 50% chance of spam or ham). The farther away from 0 you get, the more certain the statistics.
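One simple combination that behaves this way is the product of p and c. To be clear, this is only an illustration of the property; I am assuming p runs from -1 (ham) to +1 (spam) and c from 0 to 1, and the real formula may differ.

```python
# Illustrative only: a product of probability and confidence is 0 whenever
# either figure is 0, and only approaches -1 or +1 when both are strong.
# The ranges and the formula itself are assumptions, not the actual API.

def combined_score(p, c):
    """Combine probability p in [-1, +1] with confidence c in [0, 1]."""
    if not -1.0 <= p <= 1.0:
        raise ValueError("p must be between -1 and +1")
    if not 0.0 <= c <= 1.0:
        raise ValueError("c must be between 0 and 1")
    return p * c


print(combined_score(0.0, 1.0))   # no lean either way
print(combined_score(-0.9, 0.0))  # strong lean, but no confidence
print(combined_score(-1.0, 0.8))  # strongly ham with high confidence
```

Under that assumption a score of -0.8 can only come from a strongly negative p together with a strong c, which is why it would make a reasonable WHITE set point.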
Hope this helps,
_M
