John Tolmachoff (Lists) wrote:
I'll add these to the list that I maintain as well for both the
ANTIGIBBERISHSUB and ANTIGIBBERISH filters.  You shouldn't need to add
these to the base filters though since the two letter string will trip
it without any assistance.
    

I did not want to lose the QS, so I added the legit use of it.
  

I just wanted to point out that you would not in fact lose the QS hit the way I suggested, because the message will hit QS in the main file and then get credit back for QS-9000 in the anti file.  Putting QS-9000 in the main file is redundant with the two letter strings that already appear there.

Listing in both the main and anti files is only necessary for things that won't always trip the test but need counterbalancing.  So with strings such as QOS and QS-9000, you only need to add those to the anti file, while something like "parts" won't always trip the main filter and it therefore needs to be added in both places.
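To make the counterbalancing concrete, here is a rough Python sketch of the logic (not actual filter syntax).  The string lists, the weight of 5, and the function names are made up for illustration, and it assumes each file contributes one fixed weight when any of its strings is found:

```python
# Hypothetical string lists and weight; not the real filter contents.
MAIN_STRINGS = ["qs", "qo", "parts"]         # main (gibberish) file
ANTI_STRINGS = ["qs-9000", "qos", "parts"]   # anti (credit) file
WEIGHT = 5  # assumed: each file adds or credits one fixed weight on any match


def file_hits(text, strings):
    """Return True if any string from the file appears in the text."""
    lowered = text.lower()
    return any(s in lowered for s in strings)


def net_score(text):
    """Main file adds WEIGHT on a hit; anti file credits the same amount back."""
    score = 0
    if file_hits(text, MAIN_STRINGS):
        score += WEIGHT
    if file_hits(text, ANTI_STRINGS):
        score -= WEIGHT
    return score


# "qs" trips the main file, "qs-9000" earns the credit back.
print(net_score("We are QS-9000 certified"))
# "parts" is listed in both files, so it nets out instead of going negative.
print(net_score("See the attached parts list"))
# Only the main file hits here, so the weight stands.
print(net_score("xqsz vkqs offer"))
```

Under those assumptions, leaving "parts" out of MAIN_STRINGS would let the anti file hand out credit on a message that never tripped the main file, which is why it has to be listed in both.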

Regarding the parts exclusion, what I need to do is figure out the universe of ways that such lists are generally referred to.  It might be that additional exclusions would also be recommended.  I've got over 200 MB worth of hits (scoring and non-scoring) to test for variations.

FYI, I've been toying with the idea of providing exclusions for any body URL that contains a script extension with arguments, since it appears that a moderate number of legit automated mailers don't stick to more common acronyms and strings based on decimals or hexadecimals.  I don't want to defeat the test for every URL though, because a good number of hits come by way of spammers inserting gibberish into their links, and that is much needed IMO.  Again, this is something that I need to test once I separate the FPs from non-scoring hits and real positives in my capture.
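For what it's worth, the URL exclusion idea could be prototyped against a capture with something like the Python sketch below.  This is only a testing aid, not filter syntax; the extension list, the regex, and the function name are my own assumptions and would need tuning against real hits:

```python
import re

# Assumed set of script extensions; extend it to match what the capture shows.
SCRIPT_URL = re.compile(
    r"https?://[^\s\"'<>]+\.(?:php|asp|aspx|cgi|pl|jsp)\?[^\s\"'<>]+",
    re.IGNORECASE,
)


def strip_script_urls(body):
    """Drop URLs that call a script with arguments before gibberish testing.

    Other URLs are left alone, so gibberish that spammers insert
    into ordinary links is still visible to the test.
    """
    return SCRIPT_URL.sub("", body)


print(strip_script_urls("Click http://example.com/track.php?id=xqzv9 now"))
print(strip_script_urls("Visit http://example.com/xqzvkj/index.html today"))
```

Only the first URL is removed; the second has no script extension with arguments, so it stays in the body and can still trip the test.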

Matt

