Here's a lovely edge case...

I've got someone who pasted text from MS Office into an email (wish I could ban that). The text contained a numbered list, and the fourth list item started with "Date & Time". The 4 and the following period were in a span element with a margin to separate them from the text, but no actual whitespace, so the plain text version comes out as (I've used {dot} to avoid another trigger) "4{dot}Date & Time". This then triggered:

  2.0 PDS_OTHER_BAD_TLD      Untrustworthy TLDs [URI: 4{dot}date (date)]
  5.0 KAM_SOMETLD_ARE_BAD_TLD .stream, .trade, .pw, .top, .press, .bid & .date TLD Abuse

Thus consigning a meeting agenda to the trash. I suspect this is an uncommon but not rare false positive.

These rules would benefit from excluding single-character domain matches (which IIRC would be invalid domains anyway). That way this sort of FP would be avoided. For bonus points, excluding three-character roman numerals under 10 (iii, vii, etc.) would be useful too.
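
To make the suggestion concrete, here is a rough Python sketch of the exclusion logic. It is not how SpamAssassin or the rules above are implemented: the TLD set is copied from the rule description in the report, and the names BAD_TLDS, ROMAN_UNDER_10, CANDIDATE and suspicious_hosts are all made up for illustration. It also covers every roman numeral from i to ix, not just the three-character ones.

```python
import re

# Assumption: TLD list copied from the rule description in the report above;
# the real rules may use a different or longer list.
BAD_TLDS = {"stream", "trade", "pw", "top", "press", "bid", "date"}

# Roman numerals for 1-9, as typically used for list numbering.
ROMAN_UNDER_10 = {"i", "ii", "iii", "iv", "v", "vi", "vii", "viii", "ix"}

# Naive "label{dot}tld" matcher, for illustration only. It looks only at the
# label immediately before the TLD and is far simpler than real URI parsing.
CANDIDATE = re.compile(r"\b([a-z0-9-]+)\.([a-z]{2,})\b", re.IGNORECASE)


def suspicious_hosts(text):
    """Return label.tld matches on a bad TLD, skipping list-number lookalikes."""
    hits = []
    for match in CANDIDATE.finditer(text):
        label = match.group(1).lower()
        tld = match.group(2).lower()
        if tld not in BAD_TLDS:
            continue
        # Proposed exclusion 1: a single-character label ("4", "a", ...).
        if len(label) == 1:
            continue
        # Proposed exclusion 2: a small roman numeral ("iii", "vii", ...).
        if label in ROMAN_UNDER_10:
            continue
        hits.append(label + "." + tld)
    return hits
```

With these two checks, the "4{dot}Date & Time" text from the agenda no longer produces a hit, while a longer label in front of one of the listed TLDs still does. In the actual rules the same idea would presumably be expressed as a constraint on the matched hostname rather than as a separate function.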

--
For SpamAssassin Users List