On Feb 16, 2015, at 11:47 AM, Kevin A. McGrail <kmcgr...@pccc.com> wrote:
> I'm happy to look at a recent sample and throw it through my system to see
> what it hits but overall, I've been seeing the exact opposite.

So, one of my users has been getting dozens (sometimes nearly 100) FNs per DAY over the last few weeks. Even though many of these emails are hitting BAYES_999, they are not hitting any other non-negligible scoring rules. I have set BAYES_99 + BAYES_999 to a combined score of 4.9 because I don't want it to be a complete poison pill, but this is contributing to something like 50% of the FNs (where only BAYES_999 is contributing to the score because no other rules are hitting).

The other 50% are not getting high-enough Bayes scores, but even then, many still don't hit many (or any) other scoring rules so that they would still have this problem even if they scored BAYES_999.

In many cases, it would appear that he is getting a "fresh batch" that hasn't yet hit the RBLs or hash DBs, which is why even with BAYES_999 they don't score over the 5.0 threshold... it's causing some severe inbox unpleasantness.

I've been trying to come up with some good URI template rules to block many of these but spammers are getting sufficiently generic in their URIs that I worry strongly about FPs for these. I haven't been able to identify any other distinctive markers in the template against which I can reliably write rules, although I also don't have a program that does strong comparisons to look for patterns (I'm just doing this by eye).

I have his spam corpus of a few thousand messages... simple Bayes training doesn't seem to help, so some sort of template matching would really be useful here, but as I said, I haven't really found anything that I feel comfortable writing rules against without significant risk of FPs.

Might anyone have some ideas? This is getting to be a serious issue for this user and I'm getting complaints...

Thanks.

(For reference: running SA 3.4.0 on CentOS 5.11.)

--- Amir
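
For the "program that does strong comparisons" mentioned above, here is a minimal sketch of one possible approach, not something taken from SpamAssassin itself. It counts word n-grams that recur across a large share of the spam corpus, so the candidates can be reviewed instead of comparing messages by eye. The function names (shingles, common_shingles), the n-gram length of 4 and the 50% cutoff are all illustrative assumptions, and the messages are assumed to be already-decoded body text.

```python
from collections import Counter
import re


def shingles(text, n=4):
    """Return the set of lowercase word n-grams in one message body."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def common_shingles(messages, n=4, min_fraction=0.5):
    """List n-grams present in at least min_fraction of the messages.

    Hypothetical helper: n and min_fraction are arbitrary starting points.
    """
    doc_freq = Counter()
    for body in messages:
        # A set is used per message, so each message counts once per n-gram.
        doc_freq.update(shingles(body, n))
    cutoff = min_fraction * len(messages)
    hits = [(s, c) for s, c in doc_freq.items() if c >= cutoff]
    # Most widespread phrases first, ties broken alphabetically.
    return sorted(hits, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    sample = [
        "Click here to claim your prize today",
        "Please click here to claim your reward",
        "Meeting notes attached for review",
    ]
    for phrase, count in common_shingles(sample, n=4, min_fraction=0.6):
        print(count, phrase)
```

Any phrase this turns up is only a candidate. Given the FP concern, the same count would presumably need to be run against a ham corpus, keeping only phrases that are common in the spam and absent from the ham, before writing a body rule around one.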