that old message I was talking about.
---------- Forwarded message ----------
From: Daniel Quinlan <quin...@pathname.com>
Date: Sat, May 22, 2004 at 16:25
Subject: DNSBL accuracy using -firsttrusted
To: spamassassin-...@incubator.apache.org

Someone at Spamhaus poked me to try testing only the last IP address
with XBL and I tested it and it helps reduce false positives quite
nicely.

The concept with XBL is that if it came most recently from an okay host,
then the message is probably okay too. It's a bit spooky but it works
and I suppose it is closer in behavior to how blacklists are generally
used at connect time, so perhaps most are tuned to be used this way.

The main caveat is that if trusted networks is not guessed or set
correctly, then *no* blacklist hits will happen and the net score set
will be used to the detriment of the site.

I tried the same idea on more or less every applicable blacklist and
check out the results:

------- start of cut text --------------
OVERALL%    SPAM%     HAM%    S/O  RANK  SCORE  NAME
   29979    14999    14980  0.500  0.00   0.00  (all messages)
 100.000  50.0317  49.9683  0.500  0.00   0.00  (all messages as %)
  12.212  24.4083   0.0000  1.000  1.00   0.01  T_RCVD_IN_NJABL_PROXY
  12.962  25.7951   0.1135  0.996  0.57   0.00  RCVD_IN_NJABL_PROXY
  18.186  36.3291   0.0200  0.999  0.95   1.00  __T_RCVD_IN_NJABL
  19.877  38.1225   1.6088  0.960  0.30   1.00  __RCVD_IN_NJABL
   8.613  17.2145   0.0000  1.000  0.91   0.01  T_RCVD_IN_SORBS_MISC
   9.136  18.2412   0.0200  0.999  0.80   0.00  RCVD_IN_SORBS_MISC
  29.124  58.1705   0.0401  0.999  0.90   0.01  T_RCVD_IN_DSBL
  30.395  60.2640   0.4873  0.992  0.43   0.00  RCVD_IN_DSBL
   7.966  15.9211   0.0000  1.000  0.87   0.01  T_RCVD_IN_SORBS_HTTP
   8.449  16.8011   0.0868  0.995  0.49   0.00  RCVD_IN_SORBS_HTTP
   5.337  10.6540   0.0134  0.999  0.74   0.01  T_RCVD_IN_RFCI
   7.162  12.3675   1.9493  0.864  0.00   0.00  RCVD_IN_RFCI
   9.804  19.5613   0.0334  0.998  0.73   0.01  T_RCVD_IN_SBL
   9.927  19.7747   0.0668  0.997  0.62   0.00  RCVD_IN_SBL
  14.610  29.1486   0.0534  0.998  0.73   1.00  __T_RCVD_IN_SBL_XBL
  15.044  29.7820   0.2870  0.990  0.35   1.00  __RCVD_IN_SBL_XBL
   3.116   6.2204   0.0067  0.999  0.72   0.00  RCVD_IN_NJABL_SPAM
   3.062   6.1137   0.0067  0.999  0.70   0.01  T_RCVD_IN_NJABL_SPAM
   2.055   4.1069   0.0000  1.000  0.66   0.01  T_RCVD_IN_BL_SPAMCOP_NET
   2.235   4.3070   0.1602  0.964  0.14   0.00  RCVD_IN_BL_SPAMCOP_NET
   5.100  10.1740   0.0200  0.998  0.65   0.01  T_RCVD_IN_XBL
   5.417  10.6074   0.2203  0.980  0.18   0.00  RCVD_IN_XBL
  21.869  43.5562   0.1535  0.996  0.64   0.01  T_RCVD_IN_SORBS_DUL
  22.146  44.0363   0.2270  0.995  0.48   0.00  RCVD_IN_SORBS_DUL
  34.071  67.9112   0.1869  0.997  0.63   1.00  __T_RCVD_IN_SORBS
  42.410  70.9047  13.8785  0.836  0.34   1.00  __RCVD_IN_SORBS
   1.868   3.7336   0.0000  1.000  0.64   0.00  RCVD_IN_SORBS_SMTP
   1.731   3.4602   0.0000  1.000  0.62   0.01  T_RCVD_IN_SORBS_SMTP
   2.935   5.8537   0.0134  0.998  0.63   0.00  RCVD_IN_NJABL_DIALUP
   2.879   5.7404   0.0134  0.998  0.61   0.01  T_RCVD_IN_NJABL_DIALUP
   0.934   1.8668   0.0000  1.000  0.57   0.01  T_RCVD_IN_RSL
   1.041   2.0735   0.0067  0.997  0.55   0.00  RCVD_IN_RSL
   0.607   1.2134   0.0000  1.000  0.53   0.01  T_RCVD_IN_SORBS_SOCKS
   0.637   1.2401   0.0334  0.974  0.33   0.00  RCVD_IN_SORBS_SOCKS
   0.430   0.8601   0.0000  1.000  0.49   0.01  T_RCVD_IN_SORBS_WEB
   0.447   0.8867   0.0067  0.993  0.46   0.00  RCVD_IN_SORBS_WEB
   0.254   0.5067   0.0000  1.000  0.44   0.01  T_RCVD_IN_SORBS_ZOMBIE
   0.307   0.5867   0.0267  0.956  0.29   0.00  RCVD_IN_SORBS_ZOMBIE
   0.117   0.2333   0.0000  1.000  0.42   0.00  RCVD_IN_NJABL_RELAY
   0.113   0.2267   0.0000  1.000  0.40   0.01  T_RCVD_IN_NJABL_RELAY
------- end ----------------------------

change in RANK (relative to just the IP-based blacklists and the new
-firsttrusted ones in testing)

  0.74  RCVD_IN_RFCI
  0.52  RCVD_IN_BL_SPAMCOP_NET
  0.47  RCVD_IN_XBL
  0.47  RCVD_IN_DSBL
  0.43  RCVD_IN_NJABL_PROXY
  0.38  RCVD_IN_SORBS_HTTP
  0.20  RCVD_IN_SORBS_SOCKS
  0.16  RCVD_IN_SORBS_DUL
  0.15  RCVD_IN_SORBS_ZOMBIE
  0.11  RCVD_IN_SORBS_MISC
  0.11  RCVD_IN_SBL
  0.03  RCVD_IN_SORBS_WEB
  0.02  RCVD_IN_RSL
 -0.02  RCVD_IN_NJABL_DIALUP
 -0.02  RCVD_IN_NJABL_RELAY
 -0.02  RCVD_IN_NJABL_SPAM
 -0.02  RCVD_IN_SORBS_SMTP

and not really relevant unless we change entire sets to reduce the
number of look-ups:

  0.65  __RCVD_IN_NJABL
  0.38  __RCVD_IN_SBL_XBL
  0.29  __RCVD_IN_SORBS

Results for some fresh mail that may still have a few misfiles:

------- start of cut text --------------
OVERALL%    SPAM%     HAM%    S/O  RANK  SCORE  NAME
    4039     2294     1745  0.568  0.00   0.00  (all messages)
 100.000  56.7962  43.2038  0.568  0.00   0.00  (all messages as %)
  48.651  85.6582   0.0000  1.000  1.00   1.00  __T_RCVD_IN_SBL_XBL
  50.012  87.8378   0.2865  0.997  0.68   1.00  __RCVD_IN_SBL_XBL
  34.043  59.9390   0.0000  1.000  0.93   0.01  T_RCVD_IN_BL_SPAMCOP_NET
  35.058  61.4211   0.4011  0.994  0.55   0.00  RCVD_IN_BL_SPAMCOP_NET
  33.028  58.1517   0.0000  1.000  0.90   0.01  T_RCVD_IN_XBL
  34.315  60.3313   0.1146  0.998  0.78   0.00  RCVD_IN_XBL
  28.571  50.3051   0.0000  1.000  0.86   1.00  __T_RCVD_IN_SORBS
  36.172  53.0078  14.0401  0.791  0.11   1.00  __RCVD_IN_SORBS
  26.195  46.1203   0.0000  1.000  0.81   0.01  T_RCVD_IN_DSBL
  27.309  47.2537   1.0888  0.977  0.32   0.00  RCVD_IN_DSBL
  19.064  33.5658   0.0000  1.000  0.78   0.00  RCVD_IN_SORBS_DUL
  18.693  32.9119   0.0000  1.000  0.76   0.01  T_RCVD_IN_SORBS_DUL
  16.712  29.4246   0.0000  1.000  0.71   0.01  T_RCVD_IN_SBL
  16.984  29.7733   0.1719  0.994  0.48   0.00  RCVD_IN_SBL
  11.290  19.8779   0.0000  1.000  0.67   1.00  __T_RCVD_IN_NJABL
  12.528  20.6190   1.8911  0.916  0.00   1.00  __RCVD_IN_NJABL
   7.130  12.5545   0.0000  1.000  0.62   0.01  T_RCVD_IN_NJABL_PROXY
   7.576  13.2084   0.1719  0.987  0.38   0.00  RCVD_IN_NJABL_PROXY
   5.967  10.5057   0.0000  1.000  0.57   0.01  T_RCVD_IN_RFCI
   7.774  12.5545   1.4900  0.894  0.01   0.00  RCVD_IN_RFCI
   5.769  10.1569   0.0000  1.000  0.55   0.01  T_RCVD_IN_SORBS_MISC
   6.041  10.5929   0.0573  0.995  0.51   0.00  RCVD_IN_SORBS_MISC
   4.481   7.8901   0.0000  1.000  0.50   0.01  T_RCVD_IN_SORBS_HTTP
   4.828   8.3697   0.1719  0.980  0.26   0.00  RCVD_IN_SORBS_HTTP
   2.253   3.9669   0.0000  1.000  0.47   0.01  T_RCVD_IN_NJABL_DIALUP
   2.253   3.9669   0.0000  1.000  0.47   0.00  RCVD_IN_NJABL_DIALUP
   2.179   3.8361   0.0000  1.000  0.45   0.00  RCVD_IN_RSL
   1.956   3.4438   0.0000  1.000  0.43   0.01  T_RCVD_IN_RSL
   1.783   3.1386   0.0000  1.000  0.40   0.01  T_RCVD_IN_NJABL_SPAM
   1.783   3.1386   0.0000  1.000  0.40   0.00  RCVD_IN_NJABL_SPAM
   0.693   1.2206   0.0000  1.000  0.38   0.00  RCVD_IN_SORBS_SOCKS
   0.693   1.2206   0.0000  1.000  0.38   0.01  T_RCVD_IN_SORBS_SOCKS
   0.619   1.0898   0.0000  1.000  0.35   0.00  RCVD_IN_SORBS_SMTP
   0.569   1.0026   0.0000  1.000  0.33   0.01  T_RCVD_IN_SORBS_SMTP
   0.545   0.9590   0.0000  1.000  0.31   0.00  RCVD_IN_SORBS_WEB
   0.545   0.9590   0.0000  1.000  0.31   0.01  T_RCVD_IN_SORBS_WEB
   0.149   0.2616   0.0000  1.000  0.26   0.01  T_RCVD_IN_NJABL_RELAY
   0.149   0.2616   0.0000  1.000  0.26   0.00  RCVD_IN_NJABL_RELAY
   0.099   0.1744   0.0000  1.000  0.23   0.00  RCVD_IN_SORBS_ZOMBIE
   0.074   0.1308   0.0000  1.000  0.21   0.01  T_RCVD_IN_SORBS_ZOMBIE
------- end ----------------------------

  0.56  RCVD_IN_RFCI
  0.49  RCVD_IN_DSBL
  0.38  RCVD_IN_BL_SPAMCOP_NET
  0.24  RCVD_IN_SORBS_HTTP
  0.24  RCVD_IN_NJABL_PROXY
  0.23  RCVD_IN_SBL
  0.12  RCVD_IN_XBL
  0.04  RCVD_IN_SORBS_MISC
  0.00  RCVD_IN_SORBS_WEB
  0.00  RCVD_IN_SORBS_SOCKS
  0.00  RCVD_IN_NJABL_SPAM
  0.00  RCVD_IN_NJABL_RELAY
  0.00  RCVD_IN_NJABL_DIALUP
 -0.02  RCVD_IN_RSL
 -0.02  RCVD_IN_SORBS_DUL
 -0.02  RCVD_IN_SORBS_SMTP
 -0.02  RCVD_IN_SORBS_ZOMBIE

  0.75  __RCVD_IN_SORBS
  0.67  __RCVD_IN_NJABL
  0.32  __RCVD_IN_SBL_XBL

and now Justin's results:

------- start of cut text --------------
OVERALL%    SPAM%     HAM%    S/O  RANK  SCORE  NAME
   15995     7997     7998  0.500  0.00   0.00  (all messages)
 100.000  49.9969  50.0031  0.500  0.00   0.00  (all messages as %)
  11.279  22.5585   0.0000  1.000  0.94   0.01  T_RCVD_IN_NJABL_PROXY
  12.035  23.5338   0.5376  0.978  0.77   1.20  RCVD_IN_NJABL_PROXY
  16.211  32.4247   0.0000  1.000  0.96   0.00  __T_RCVD_IN_NJABL
  19.425  33.8502   5.0013  0.871  0.60   0.00  __RCVD_IN_NJABL
   8.234  16.4687   0.0000  1.000  0.91   0.01  T_RCVD_IN_SORBS_MISC
   9.015  17.2940   0.7377  0.959  0.71   1.20  RCVD_IN_SORBS_MISC
  32.873  65.7497   0.0000  1.000  0.99   0.01  T_RCVD_IN_DSBL
  34.117  67.5628   0.6752  0.990  0.79   1.10  RCVD_IN_DSBL
   7.984  15.9685   0.0000  1.000  0.90   0.01  T_RCVD_IN_SORBS_HTTP
   8.434  16.7063   0.1625  0.990  0.85   1.20  RCVD_IN_SORBS_HTTP
   7.265  14.5304   0.0000  1.000  0.90   0.01  T_RCVD_IN_RFCI
   8.778  16.5062   1.0503  0.940  0.66   0.10  RCVD_IN_RFCI
   2.738   5.4771   0.0000  1.000  0.80   0.01  T_RCVD_IN_SBL
   3.026   5.9772   0.0750  0.988  0.78   1.27  RCVD_IN_SBL
  39.162  78.3294   0.0000  1.000  1.00   0.00  __T_RCVD_IN_SBL_XBL
  40.913  80.7428   1.0878  0.987  0.75   0.00  __RCVD_IN_SBL_XBL
   0.863   1.7256   0.0000  1.000  0.63   0.01  T_RCVD_IN_NJABL_SPAM
   0.913   1.7757   0.0500  0.973  0.62   0.74  RCVD_IN_NJABL_SPAM
  11.597  23.1962   0.0000  1.000  0.94   0.01  T_RCVD_IN_BL_SPAMCOP_NET
  12.085  23.8089   0.3626  0.985  0.83   2.25  RCVD_IN_BL_SPAMCOP_NET
  37.149  74.3029   0.0000  1.000  0.99   0.01  T_RCVD_IN_XBL
  38.700  76.3911   1.0128  0.987  0.76   1.00  RCVD_IN_XBL
  23.370  46.7425   0.0000  1.000  0.97   0.01  T_RCVD_IN_SORBS_DUL
  23.651  46.9926   0.3126  0.993  0.87   2.55  RCVD_IN_SORBS_DUL
  33.117  66.2373   0.0000  1.000  0.99   0.00  __T_RCVD_IN_SORBS
  49.722  69.5011  29.9450  0.699  0.54   0.00  __RCVD_IN_SORBS
   2.588   5.1519   0.0250  0.995  0.78   1.20  RCVD_IN_SORBS_SMTP
   2.501   5.0019   0.0000  1.000  0.78   0.01  T_RCVD_IN_SORBS_SMTP
   3.532   7.0651   0.0000  1.000  0.83   0.62  RCVD_IN_NJABL_DIALUP
   3.482   6.9651   0.0000  1.000  0.83   0.01  T_RCVD_IN_NJABL_DIALUP
   4.020   8.0405   0.0000  1.000  0.85   0.53  RCVD_IN_RSL
   3.732   7.4653   0.0000  1.000  0.84   0.01  T_RCVD_IN_RSL
   0.706   1.4130   0.0000  1.000  0.61   0.01  T_RCVD_IN_SORBS_SOCKS
   0.706   1.4130   0.0000  1.000  0.61   1.62  RCVD_IN_SORBS_SOCKS
   0.688   1.3755   0.0000  1.000  0.61   0.01  T_RCVD_IN_SORBS_WEB
   0.719   1.4255   0.0125  0.991  0.60   2.90  RCVD_IN_SORBS_WEB
   0.088   0.1751   0.0000  1.000  0.50   2.70  RCVD_IN_SORBS_ZOMBIE
   0.025   0.0500   0.0000  1.000  0.49   0.01  T_RCVD_IN_SORBS_ZOMBIE
   0.688   1.3755   0.0000  1.000  0.61   0.01  T_RCVD_IN_NJABL_RELAY
   0.719   1.4005   0.0375  0.974  0.59   1.41  RCVD_IN_NJABL_RELAY
   0.000   0.0000   0.0000  0.500  0.49   0.10  RCVD_IN_NJABL_MULTI
   0.000   0.0000   0.0000  0.500  0.49   0.01  T_RCVD_IN_NJABL_MULTI
   0.000   0.0000   0.0000  0.500  0.49   0.01  T_RCVD_IN_SORBS_BLOCK
   0.000   0.0000   0.0000  0.500  0.49   0.00  RCVD_IN_SORBS_BLOCK
   0.000   0.0000   0.0000  0.500  0.49   0.01  T_RCVD_IN_NJABL_CGI
   0.000   0.0000   0.0000  0.500  0.49   0.10  RCVD_IN_NJABL_CGI
------- end ----------------------------

This RANK relative to all tests since this is from last night's run:

  0.24  RCVD_IN_RFCI
  0.23  RCVD_IN_XBL
  0.20  RCVD_IN_SORBS_MISC
  0.20  RCVD_IN_DSBL
  0.17  RCVD_IN_NJABL_PROXY
  0.11  RCVD_IN_BL_SPAMCOP_NET
  0.10  RCVD_IN_SORBS_DUL
  0.05  RCVD_IN_SORBS_HTTP
  0.02  RCVD_IN_SBL
  0.02  RCVD_IN_NJABL_RELAY
  0.01  RCVD_IN_SORBS_WEB
  0.01  RCVD_IN_NJABL_SPAM
  0.00  RCVD_IN_SORBS_SOCKS
  0.00  RCVD_IN_SORBS_SMTP
  0.00  RCVD_IN_SORBS_BLOCK
  0.00  RCVD_IN_NJABL_MULTI
  0.00  RCVD_IN_NJABL_DIALUP
  0.00  RCVD_IN_NJABL_CGI
 -0.01  RCVD_IN_RSL
 -0.01  RCVD_IN_SORBS_ZOMBIE

  0.45  __RCVD_IN_SORBS
  0.36  __RCVD_IN_NJABL
  0.25  __RCVD_IN_SBL_XBL

It seems like a huge improvement across the board. There are a few
small slips, but do we care?

-- 
Daniel Quinlan
http://www.pathname.com/~quinlan/
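
As an illustration of the idea discussed in the message (look up only the last address that handed the mail to a trusted host, rather than every relay), here is a minimal Python sketch. It is not SpamAssassin's code: the function names, the relay-list input format (most recent hop first), and the zone `dnsbl.example` are all assumptions made for the example. The reversed-octet query name is the usual DNSBL convention for IPv4, and the `s_over_o` helper reproduces the S/O column of the tables above as SPAM% / (SPAM% + HAM%), with 0.5 for rules that never fire (as the zero-hit rows show).

```python
import ipaddress


def last_external_relay(relay_ips, trusted_networks):
    """Return the most recent relay IP outside the trusted networks.

    relay_ips: relay addresses, most recent hop first (hypothetical format).
    trusted_networks: CIDR strings describing hosts we trust.
    Returns None when every relay is trusted, in which case there is
    nothing to look up and no blacklist rule can hit.
    """
    nets = [ipaddress.ip_network(n) for n in trusted_networks]
    for ip in relay_ips:
        addr = ipaddress.ip_address(ip)
        if not any(addr in net for net in nets):
            return ip
    return None


def dnsbl_query_name(ipv4, zone):
    """Build the DNS name queried for an IPv4 address: reversed octets + zone."""
    return ".".join(reversed(ipv4.split("."))) + "." + zone


def s_over_o(spam_pct, ham_pct):
    """S/O as it appears in the tables: spam share of all hits (0.5 if no hits)."""
    total = spam_pct + ham_pct
    return 0.5 if total == 0 else spam_pct / total


if __name__ == "__main__":
    # Documentation-range addresses; the first hop is the site's own relay.
    relays = ["10.0.0.5", "203.0.113.7", "198.51.100.9"]

    ip = last_external_relay(relays, ["10.0.0.0/8"])
    print(ip, dnsbl_query_name(ip, "dnsbl.example"))

    # With trusted networks left unset, the site's own relay is picked,
    # which is one way the "no blacklist hits" caveat could show up.
    print(last_external_relay(relays, []))

    # RCVD_IN_NJABL_PROXY row from the first table.
    print(round(s_over_o(25.7951, 0.1135), 3))
```

The sketch checks exactly one address per message, which is why a wrong trust boundary matters so much: whichever address is selected is the only one any list ever sees.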