On Mon, Sep 28, 2009 at 15:33, Warren Togami <[email protected]> wrote:
> On 09/28/2009 01:44 AM, Warren Togami wrote:
>
>>
>> Use RCVD_IN_PSBL_2WEEKS to assign a score. RCVD_IN_PSBL_DEEP would be the
>> equivalent to RCVD_IN_PSBL_2WEEKS. The stricter RCVD_IN_PSBL would be a
>> subrule that matches only with last-external, thereby being stricter and
>> eliminating most of the already minuscule chance of false positives.
>> Thus the full score of RCVD_IN_PSBL_2WEEKS would be split into two parts.
>>
>> Before
>> RCVD_IN_PSBL_2WEEKS score 2
>> This rule does deep parsing which is often good, but sometimes bad.
>>
>> After
>> RCVD_IN_PSBL score 2
>> This rule matches only last-external making it safer from FP's.
>> RCVD_IN_PSBL_DEEP score -1
>> This rule can be scored separately, subtracting a tiny amount if the
>> PSBL hit was found in deep parsing. Both rules would trigger, one adds,
>> the second subtracts. The subtracting rule would never fire on its own.
>>
>
> OK, the above "subtract" probably needs some explanation.
>
> This came from a feeling of discomfort with deep parsing of PSBL. PSBL
> *is* working well in masscheck with deep parsing with very few FP's. The
> trouble is with FP's like an e-mail sent from a wireless broadband card
> via a legitimate mail server, where the IP is legitimately blacklisted.
> Even though this alone is not likely to cause their mail to be classified
> as spam with the default threshold of 5, there is nothing the user can do
> about the previous user of that IP having sent spam.
>
> For this reason I think we should have used psbl-lastexternal.
> psbl-lastexternal is extra certain to be correct and deserves a high score.
> [1] Deep parsing however has been shown to be mostly correct and probably
> deserves a smaller score in cases where psbl-lastexternal didn't hit. Can
> spamassassin do separate sub-rule matches of lastexternal and deep parsing
> without querying twice?
>
> [1] We are still hitting some yahoo FP's because filtering out Yahoo from
> the blacklist was broken until a few days ago. These should disappear
> entirely by the two week timeout.

I agree we should have used lastexternal.

We can do the 'subtract' trick, but I'd prefer to do it by simply
splitting the rules into a RCVD_IN_PSBL_LASTEXTERNAL (score 2) and
RCVD_IN_PSBL_DEEP (score 1), possibly using metas, so that users don't
see a confusingly negative score hitting on spam -- principle of least
surprise and all that.

--
--j.
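
For illustration, here is a minimal Python sketch of the score arithmetic
behind the two proposals. It is not SpamAssassin code. The scores (2, -1, 1)
and rule names come from the thread; the function names, the boolean inputs,
and the reading of the "subtract" idea as "base rule fires on any PSBL hit,
negative rule fires only when the hit was not last-external" are assumptions
made for this example.

```python
# Hypothetical model of the two scoring schemes; not SpamAssassin code.
# A last-external hit is treated as a special case of a deep hit.

def subtract_scheme(last_external_hit, deep_hit):
    """Assumed reading of the 'subtract' trick: +2 on any PSBL hit,
    then -1 when the hit came only from deep parsing."""
    rules = {}
    if last_external_hit or deep_hit:
        rules["RCVD_IN_PSBL"] = 2.0
        if not last_external_hit:
            rules["RCVD_IN_PSBL_DEEP"] = -1.0
    return rules

def split_scheme(last_external_hit, deep_hit):
    """Split rules: 2 for a last-external hit, otherwise 1 for a
    deep-only hit. No rule carries a negative score."""
    rules = {}
    if last_external_hit:
        rules["RCVD_IN_PSBL_LASTEXTERNAL"] = 2.0
    elif deep_hit:
        rules["RCVD_IN_PSBL_DEEP"] = 1.0
    return rules

def total(rules):
    """Sum the scores of all rules that fired."""
    return sum(rules.values())

if __name__ == "__main__":
    for last_ext, deep in [(False, False), (False, True), (True, True)]:
        a = subtract_scheme(last_ext, deep)
        b = split_scheme(last_ext, deep)
        print(last_ext, deep, a, total(a), b, total(b))
```

Under these assumptions both schemes give the same total for every case. The
only difference is what the user sees in the report: the subtract scheme
lists a rule with a negative score on a deep-only hit, while the split scheme
never does.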
