RE: [Declude.JunkMail] Sniffer IP Reputation for "white" listing
As Pete has already provided input on this, I won't belabor the answer other than to say that when implementing Message Sniffer we followed Pete's advice: "Since many legitimate ISPs also produce a lot of spam it might be useful to apply a bias to this weight so that these systems appear closer to zero." So currently we do not allow a negative value as a BASEPOINT. That said, if you think it is really important to be able to use a negative value as you described in your post, let me know and I can add it to the dev list.

David Barker
VP Operations
Declude
Your Email security is our business
978.499.2933 office
978.988.1311 fax
dbar...@declude.com

From: supp...@declude.com [mailto:supp...@declude.com] On Behalf Of Andy Schmidt
Sent: Saturday, May 01, 2010 1:51 PM
To: declude.junkmail@declude.com
Subject: RE: [Declude.JunkMail] Sniffer IP Reputation for "white" listing

Hi Pete,

Funny - our messages overlapped. But I'm glad I was on the right track with my suspicions. Hopefully this will help Declude to refine things.

>> a better way to do it would be to scale the result so that from 0 to -1 the "negative" weight (let's pick a factor of 5) would rise linearly from 0 to -5 and similarly a positive going reputation would scale linearly from 0 to +5 as the API result scaled from 0 to +1. <<

Right - that's the same scheme I just pointed out to Dave myself - except in my case you could pick a distinct factor for the "-" vs.
the "+" side of the scale (because Declude already has that option anyhow):

((Abs(Reputation Value) * 10) - Base Value) * [Pos or Neg]WeightFactor = Final Weight

For this line in the Declude config:

IPREPUTATION SNFIPREP x 0 2 -1

it would result in weights between +20 and -10, e.g.:

Reputation  0.0: ((0.0 * 10) - 0) *  2 =   0
Reputation  0.3: ((0.3 * 10) - 0) *  2 =   6
Reputation  1.0: ((1.0 * 10) - 0) *  2 =  20
Reputation -0.3: ((0.3 * 10) - 0) * -1 =  -3
Reputation -1.0: ((1.0 * 10) - 0) * -1 = -10

Here's an important question, though: Do you have a distribution chart for the reputation scale? It of course makes a HUGE difference whether the distribution of reputations reported for the inflow of email is evenly distributed between -1.0 and +1.0, or whether it is a bell curve where 80% are in the "center" area, or whether it's some sort of exponential curve that has very few with "good" reputation, a modest amount around the 0 point, and then exponentially increasing towards the bad and turn reputations? This way one could decide what factors to use for the + and - sides and where to set the "mid" point (Declude allows you to shift the mid-point left and right).

>> I'm guessing on how that test is implemented, but if I've guessed correctly then -0.8 would certainly be a good WHITE set point.<<

Thank you - that means in their "default" (sample) config file, they really should adjust the midpoint away from "0" to "-8" (they multiply the reputation scale by 10 to be able to work with integers):

IPREPUTATION SNFIPREP x 0 2 -1

probably to

IPREPUTATION SNFIPREP x -8 2 -1

but I'd have to check with Dave to see if "-8" will indeed set the midpoint to -0.8 or if the sign has to be reversed.

Thanks for taking the time to help all of us understand Sniffer in the context of the Declude integration. I'm very happy that Declude took the time and integrated the product.
I just would like to make sure it comes with an implementation sample that is a good enough compromise for "day-to-day" use.

Best Regards,
Andy

-----Original Message-----
From: supp...@declude.com [mailto:supp...@declude.com] On Behalf Of Pete McNeil
Sent: Saturday, May 01, 2010 11:57 AM
To: declude.junkmail@declude.com
Subject: Re: [Declude.JunkMail] Sniffer IP Reputation for "white" listing

On 4/30/2010 9:32 PM, Andy Schmidt wrote:
> But your documentation of the reputation system has a graph that shows that
> there is yet another category: "WHITE".
>
I don't know the details of Declude's implementation. Presumably they could (or maybe even do) implement WHITE.
> The SNFIPREP test does offer the ability to define at what decimal value
> (between -1 and +1, in .1 increments) a weight can be subtracted. But the
> question is - is that SENSIBLE use of your reputation database? For example,
> could -0.8 be a sensible threshold to give an email "credit" for coming from
> a reputable IP source?
>
I'm guessing on how that test is implemen
Re: [Declude.JunkMail] Sniffer IP Reputation for "white" listing
On 5/1/2010 1:51 PM, Andy Schmidt wrote:
> Right - that's the same scheme I just pointed out to Dave myself - except in my case you could pick a distinct factor for the "-" vs. the "+" side of the scale (because Declude already has that option anyhow)

I was trying to provide a simple example. In practice it would probably be better to have separate positive and negative going weights.

> Here’s an important question, though: Do you have a distribution chart for the reputation scale? It of course makes a HUGE difference whether the distribution of reputations reported for the inflow of email is evenly distributed between -1.0 and +1.0, or whether it is a bell curve where 80% are in the “center” area, or whether it’s some sort of exponential curve that has very few with “good” reputation, a modest amount around the 0 point, and then exponentially increasing towards the bad and turn reputations? This way one could decide what factors to use for the + and – sides and where to set the “mid” point (Declude allows you to shift the mid-point left and right).

The research we have shows that the curve is largely bipolar and heavily weighted toward the black. Supposedly "good" ISPs frequently produce > 90% spam from their systems!! Indeed one of the mistakes we made during early testing was to assume that anybody producing more than 80% spam was probably not to be trusted and that the remaining 20% might be explained largely by false negatives --- we were very wrong about that. (SCIENCE!)

On the other hand, good reputation values do occur and when there is a strong confidence value they can often be trusted. BUT NOT ALWAYS... When one of the new pre-tested campaigns hits a fresh bot-net some of the sources can gain strong positive reputations for a short time.
Our real-time IP conflict instrumentation has shown us a clearer picture of this -- while we knew it was possible (even likely) we were surprised to see how often solid new rules for these campaigns will be met with auto-panics in the field when first deployed. For this reason we chose a nonlinear curve to boil the statistics down to a single value.

R = sign(p) * sqrt(abs(p) * c)

From: https://svn.microneil.com/websvn/filedetails.php?repname=PKG-SNF-SDK-WIN&path=%2Ftrunk%2FSNFMultiDll%2Fsnfmultidll.cpp

    default: {                                              // Ugly means we calculate the reputation
        Reputation =                                        // figure from the statistics. Start by
            sqrt(fabs(Tester.G.Probability() * Tester.G.Confidence()));  // combining the c & p figures then
        if(0 > Tester.G.Probability()) Reputation *= -1.0;  // flip the sign if p is negative.
    }

I recommend a softer weight for "good looking" IP reputations -- something calculated to negate "iffy" tests and avoid false positives. For "bad looking" IP reputations a strong weight is generally sound provided there are some countering weights to balance it off when one of those "Good" ISPs is delivering the message in the midst of their 80% spam flood.

> >> I'm guessing on how that test is implemented, but if I've guessed correctly then -0.8 would certainly be a good WHITE set point.<<
> Thank you – that means in their “default” (sample) config file, they really should adjust the midpoint away from “0” to “-8” (they multiply the reputation scale by 10 to be able to work with integers)

You know -- a lot of the professional filtering houses that started with (or still use) Declude adjusted their scales up to 100 or higher in order to give more room for fine adjustments. When we were developing MDLP we preferred that as well. The choice of scale is a matter of opinion and application -- and in a weight driven system it's always up for adjustment as every weight interacts with every other weight.
Best,
_M

--
President
MicroNeil Research Corporation
www.microneil.com

---
This E-mail came from the Declude.JunkMail mailing list. To unsubscribe, just send an E-mail to imail...@declude.com, and type "unsubscribe Declude.JunkMail". The archives can be found at http://www.mail-archive.com.
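The nonlinear curve quoted in this message, R = sign(p) * sqrt(abs(p) * c), can be tried out with a few lines of Python to see how the two statistics interact. This is only a sketch that mirrors the quoted snippet: the function name `combined_reputation` and the assumed ranges (p between -1 and +1, c between 0 and 1) are illustrative assumptions, not something taken from the SNF source.

```python
import math


def combined_reputation(p: float, c: float) -> float:
    """Combine probability p and confidence c into one reputation value.

    Mirrors the quoted snippet: take sqrt(|p * c|), then restore the sign of p.
    Assumed ranges (not confirmed by the thread): -1 <= p <= 1, 0 <= c <= 1.
    """
    magnitude = math.sqrt(abs(p * c))
    return -magnitude if p < 0 else magnitude


if __name__ == "__main__":
    # A few (p, c) pairs to show that both figures must be strong
    # before the result approaches -1 or +1.
    samples = [(-1.0, 1.0), (-1.0, 0.64), (-0.5, 0.5), (0.0, 1.0), (1.0, 0.0), (0.9, 0.9)]
    for p, c in samples:
        print(f"p={p:+.2f} c={c:.2f} -> R={combined_reputation(p, c):+.3f}")
```

Under these assumptions a result near -0.8 needs both a strongly negative p and a strong c (for example p = -1.0 with c = 0.64), and a zero in either figure yields 0.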
RE: [Declude.JunkMail] Sniffer IP Reputation for "white" listing
Hi Pete,

Funny - our messages overlapped. But I'm glad I was on the right track with my suspicions. Hopefully this will help Declude to refine things.

>> a better way to do it would be to scale the result so that from 0 to -1 the "negative" weight (let's pick a factor of 5) would rise linearly from 0 to -5 and similarly a positive going reputation would scale linearly from 0 to +5 as the API result scaled from 0 to +1. <<

Right - that's the same scheme I just pointed out to Dave myself - except in my case you could pick a distinct factor for the "-" vs. the "+" side of the scale (because Declude already has that option anyhow):

((Abs(Reputation Value) * 10) - Base Value) * [Pos or Neg]WeightFactor = Final Weight

For this line in the Declude config:

IPREPUTATION SNFIPREP x 0 2 -1

it would result in weights between +20 and -10, e.g.:

Reputation  0.0: ((0.0 * 10) - 0) *  2 =   0
Reputation  0.3: ((0.3 * 10) - 0) *  2 =   6
Reputation  1.0: ((1.0 * 10) - 0) *  2 =  20
Reputation -0.3: ((0.3 * 10) - 0) * -1 =  -3
Reputation -1.0: ((1.0 * 10) - 0) * -1 = -10

Here's an important question, though: Do you have a distribution chart for the reputation scale? It of course makes a HUGE difference whether the distribution of reputations reported for the inflow of email is evenly distributed between -1.0 and +1.0, or whether it is a bell curve where 80% are in the "center" area, or whether it's some sort of exponential curve that has very few with "good" reputation, a modest amount around the 0 point, and then exponentially increasing towards the bad and turn reputations? This way one could decide what factors to use for the + and - sides and where to set the "mid" point (Declude allows you to shift the mid-point left and right).
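The proposed formula and the worked values in this message can be checked with a short Python sketch. It only illustrates the arithmetic as written here; it is not Declude's actual implementation, and the names `final_weight`, `base_value`, `pos_factor` and `neg_factor` are made up for the example.

```python
def final_weight(reputation: float, base_value: float = 0,
                 pos_factor: float = 2, neg_factor: float = -1) -> float:
    """((abs(reputation) * 10) - base_value) * factor, as proposed in the message.

    The factor is chosen by the sign of the reputation. The defaults follow the
    sample line "IPREPUTATION SNFIPREP x 0 2 -1" (base 0, factors 2 and -1).
    """
    factor = pos_factor if reputation >= 0 else neg_factor
    return (abs(reputation) * 10 - base_value) * factor


if __name__ == "__main__":
    for rep in (0.0, 0.3, 1.0, -0.3, -1.0):
        # round() hides floating-point noise in the printed value
        print(f"Reputation {rep:+.1f}: weight {round(final_weight(rep), 6):+g}")
```

How a non-zero base value should be signed is left open in this sketch, just as it is in the message itself.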
>> I'm guessing on how that test is implemented, but if I've guessed correctly then -0.8 would certainly be a good WHITE set point.<< Thank you - that means in their "default" (sample) config file, they really should adjust the midpoint away from "0" to "-8" (they multiply the reputation scale by 10 to be able to work with integers) IPREPUTATION SNFIPREP x 0 2 -1 probably to IPREPUTATION SNFIPREP x -8 2 -1 but I'd have to check with Dave to see if "-8" will indeed set the midpoint to -0.8 or if the sign has to be reversed. Thanks for taking the time to help all of us understand Sniffer in the context of the Declude integration. I'm very happy that Declude took the time and integrated the product. I just would like to make sure it comes with an implementation sample that is a good enough compromise for "day-to-day" use. Best Regards, Andy -Original Message- From: supp...@declude.com [mailto:supp...@declude.com] On Behalf Of Pete McNeil Sent: Saturday, May 01, 2010 11:57 AM To: declude.junkmail@declude.com Subject: Re: [Declude.JunkMail] Sniffer IP Reputation for "white" listing On 4/30/2010 9:32 PM, Andy Schmidt wrote: > But your documentation of the reputation system has a graph that shows that > there is yet another category: "WHITE". > I don't know the details of Declude's impelementation. Presumably they could (or maybe even do) implement WHITE. > The SNFIPREP tests does offer the ability to define at what decimal value > (between -1 and +1, in .1 increments) a weight can be subtracted. But the > question is - is that SENSIBLE use of your reputation database? Per example, > could -0.8 be a sensible threshold to give an email "credit" for coming from > a reputable IP source? > I'm guessing on how that test is implemented, but if I've guessed correctly then -0.8 would certainly be a good WHITE set point. My guess is based on using a combined score value from the IP reputation that combines the confidence figure and the probability figure. 
In that case only a strongly negative p coupled with a strong c would result in a -0.8.
> Or is it better to let the "good" reputation be considered AFTER the content
> scan and then use the "combined" exit code?
>
As I understand it Declude uses a weighting system --- except for some short-circuit abilities that means all tests are run, their scores are added together, and then the total is used to determine the disposition of the message. I don't think there is an 'AFTER' in this case.

The IP reputation test is useful in cases where a message might be too new to hit a pattern match and where the IP reputation is not quite strong enough to be in one of the
Re: [Declude.JunkMail] Sniffer IP Reputation for "white" listing
On 4/30/2010 9:32 PM, Andy Schmidt wrote:
> But your documentation of the reputation system has a graph that shows that there is yet another category: "WHITE".

I don't know the details of Declude's implementation. Presumably they could (or maybe even do) implement WHITE.

> The SNFIPREP test does offer the ability to define at what decimal value (between -1 and +1, in .1 increments) a weight can be subtracted. But the question is - is that SENSIBLE use of your reputation database? For example, could -0.8 be a sensible threshold to give an email "credit" for coming from a reputable IP source?

I'm guessing on how that test is implemented, but if I've guessed correctly then -0.8 would certainly be a good WHITE set point. My guess is based on using a combined score value from the IP reputation that combines the confidence figure and the probability figure. In that case only a strongly negative p coupled with a strong c would result in a -0.8.

> Or is it better to let the "good" reputation be considered AFTER the content scan and then use the "combined" exit code?

As I understand it Declude uses a weighting system --- except for some short-circuit abilities that means all tests are run, their scores are added together, and then the total is used to determine the disposition of the message. I don't think there is an 'AFTER' in this case.

The IP reputation test is useful in cases where a message might be too new to hit a pattern match and where the IP reputation is not quite strong enough to be in one of the GBUdb envelopes. In such a case it might be useful to combine the 'analog' reputation score with the scores from other tests to push the message over the fence one way or another... at least that's how the test was designed to work in the API we provide.

It sounds like you're describing the IP Reputation test as having thresholds. That's an interesting way to do it (I haven't looked to see if it is actually that way)...
a better way to do it would be to scale the result so that from 0 to -1 the "negative" weight (let's pick a factor of 5) would rise linearly from 0 to -5 and similarly a positive going reputation would scale linearly from 0 to +5 as the API result scaled from 0 to +1.

The API result holds 0 as meaning "I don't know" --- either because the confidence figure (c) is 0 or because the probability figure (p) is 0 (meaning a 50% chance of spam or ham). The farther away from 0 you get the more certain the statistics.

Hope this helps,
_M
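To make the linear scaling and the "add all the scores together" behaviour concrete, here is a small Python sketch. The factor of 5 comes from the example in this message; the other test names, their weights, and the hold threshold of 10 are invented purely for illustration and do not reflect any real Declude configuration.

```python
def scaled_weight(reputation: float, factor: float = 5.0) -> float:
    """Scale an API result in [-1, +1] linearly to [-factor, +factor].

    0 stays 0 ("I don't know"); values outside the range are clamped.
    """
    clamped = max(-1.0, min(1.0, reputation))
    return clamped * factor


def total_score(other_weights: dict, reputation: float) -> float:
    """Sum the weights of the other tests plus the scaled IP reputation."""
    return sum(other_weights.values()) + scaled_weight(reputation)


if __name__ == "__main__":
    HOLD_THRESHOLD = 10  # hypothetical action threshold
    other = {"HYPOTHETICAL_TEST_A": 4, "HYPOTHETICAL_TEST_B": 3}  # made-up weights
    for rep in (-0.8, 0.0, 0.8):
        score = total_score(other, rep)
        verdict = "hold" if score >= HOLD_THRESHOLD else "deliver"
        print(f"reputation {rep:+.1f}: total {score:+.1f} -> {verdict}")
```

With these made-up numbers the other tests alone stay under the threshold, and only a clearly bad reputation pushes the total over the fence, which is the kind of nudge described in the message.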
RE: [Declude.JunkMail] Sniffer IP Reputation for "white" listing
Hi Pete,

Other question. The SNFIP tests return Caution, Black, or Truncate. And the SNF client exit codes also have Truncate/Black. But your documentation of the reputation system has a graph that shows that there is yet another category: "WHITE". I don't see this represented as an SNFIP or SNF rule? Any reason why "WHITE" was left out?

The SNFIPREP test does offer the ability to define at what decimal value (between -1 and +1, in .1 increments) a weight can be subtracted. But the question is - is that SENSIBLE use of your reputation database? For example, could -0.8 be a sensible threshold to give an email "credit" for coming from a reputable IP source?

Or is it better to let the "good" reputation be considered AFTER the content scan and then use the "combined" exit code?

-----Original Message-----
From: supp...@declude.com [mailto:supp...@declude.com] On Behalf Of Pete McNeil
Sent: Friday, April 30, 2010 7:07 PM
To: declude.junkmail@declude.com
Subject: Re: [Declude.JunkMail] Sniffer IP vs. Sniffer IP Reputation vs. Sniffer Truncate

On 4/30/2010 5:16 PM, Andy Schmidt wrote:
> Hi Pete,
>
> I'm looking over Declude's recommended Sniffer configuration and trying to
> understand how much overlap there is between these options:
>
> IPREPUTATION      SNFIPREP  x   0  10  -5
>
> SNFIPCAUTION      SNFIP     x   4   5   0
> SNFIPBLACK        SNFIP     x   5  10   0
> SNFIPTRUNCATE     SNFIP     x   6  10   0
>
> SNFTRUNCATE       SNF       x  20  10   0
> SNIFFER-IP-RULES  SNF       x  63  10   0
>
> Looking at the Sniffer documentation IP test result codes
> http://www.armresearch.com/support/articles/software/snfClient/resultCodes.jsp
> it seems that the SNFIP tests for "4", "5" and "6" (SNFIPCAUTION,
> SNFIPBLACK, SNFIPTRUNCATE) might coincide with 40, 63 and 20.
>
I am not intimately familiar with Declude's configuration and SNF integration --- not like I used to be anyway (so many platforms now). I _think_ these tests work like this:

The SNFIPREP test gives you a variable weight based on the IP reputation in GBUdb.
This allows you to get some weighting positively or negatively based on the reputation even when that reputation is not in one of the defined GBUdb envelopes. It's a subtle nudge in the right direction.

The SNFIP test gives you a hard result code based only on the IP reputation when that reputation is within one of the envelopes defined for GBUdb. So if the IP reputation is in the Caution, Black, or Truncate range then that test will fire.

Presumably all of the IP tests happen before SNF scans the message -- because they can -- I don't know that they do, but I know that IP reputations can be queried before and separately from a scan. (Scans MUST happen in order for GBUdb to build up reputation data however).

Finally the SNF test responds to the normal blended result codes that SNFClient would return. So result code 20 is Truncate -- meaning that the IP reputation was so bad that SNF stopped the scan and returned the result code. Result code 63 is Black which could mean that an SNF IP rule fired (rare these days) or that no pattern matched but the IP was in the Black range in GBUdb so GBUdb took over and forced the result code from 0 (no pattern found) to 63 (Black).

Other result codes are also possible:
http://www.armresearch.com/support/articles/software/snfClient/resultCodes.jsp#msgScan

David -- if I got any of this wrong please correct me.

> However, Declude ALSO tests for your Rule Group Result Codes "20" and "63"
> which are documented here:
> http://www.armresearch.com/support/articles/software/snfServer/core.jsp
>
> 1. It seems to me, as if their SNFTRUNCATE is the same as their
> SNFIPTRUNCATE, and their SNIFFER-IP-RULES is the same as their SNFIPBLACK --
> effectively artificially inflating (doubling) the weights for these tests?
>
Yes -- if you have them configured that way. Some of the results are predictable.
If SNFIP is Black or Caution then you are virtually guaranteed to get a Black or Caution result from SNF -- unless SNF matches a pattern, in which case you will get a pattern result code from the SNF test. If SNFIP is Truncate then SNF should also return Truncate. The weights you assign to these should be set accordingly.
> 2. How do those Caution/Black/Truncate exit codes relate to SNFIPREP?
> There, any reputation > 0 (up to 1) is given an extra weight of 10. But
> doesn't SNFIPREP report from the same reputation data as the SNFIP (and
> possibly even group result codes 20 and 63)? In other words, are those IP
> addresses that generate a reputation factor of > 0 ALSO reported as
> Caution/Black or Truncate - if so, we'd now TRIPLE count that score.
>
That's not quite true... I presume the SNFIPREP test uses a sliding numeric val