On Jul 4, 2014, at 12:08 AM, haman...@t-online.de wrote:

> Hi,
>
> while this is certainly not correct - and likely does not display in every
> mail client - it would probably work in several webmailers. Perhaps this is
> the configuration the author of that crap tested.
>
> Now, I am somewhat reluctant to classify badly formatted mails as spam:
> there are many systems around, even from major players, that send
> legitimate mails like order confirmation, delivery notification, opted-in
> newsletters but do many of the formal things more right than wrong.
>
> On the other side, looking at the actual characters shows that the message
> is spam: these are cyrillic letters that happen to look exactly like
> western ones (a, e, o or such) so the obvious intent is to avoid detection
> of the strings. We have seen the same with IDN domain names that might use
> a cyrillic a to register a domain that looks like, e.g. paypal.com
>
> The list of characters is fairly short, so maybe checking for these
> characters in all commonly used variants (html entities, utf8 encoded,
> +u0430, \u0430, IDN encoded) would be a good spam indication
>
> Regards
> Wolfgang

I think you’re overlooking what a lot of tests already do: test for poor formatting.

    INVALID_DATE
    UNPARSEABLE_RELAY
    HTML_MISSING_CTYPE
    MISSING_HEADERS
    MISSING_DATE

As for encoding a Cyrillic small a: there are many ways to do this. iso-8859-4, utf-8, jp2212, gb2312, win1252, etc. I don’t think this would be very efficient; there are just too many charsets possible.

-Philip
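
To make the charset point concrete, here is a minimal Python sketch, not something either poster proposed as written. It shows that the same Cyrillic small a (U+0430) is a different byte sequence in each charset, and that a check could in principle run on the decoded Unicode text instead of on raw bytes. The function names (decode_text, has_mixed_script_word) are hypothetical, and the charsets in SAMPLE_CHARSETS are illustrative picks known to contain the letter, not the list from the message above.

```python
import html
import unicodedata

CYRILLIC_A = "\u0430"  # CYRILLIC SMALL LETTER A, looks like Latin "a"

# Illustrative charsets that contain the letter (an assumption for this sketch).
SAMPLE_CHARSETS = ("utf-8", "iso-8859-5", "koi8-r", "cp1251")


def decode_text(raw: bytes, charset: str) -> str:
    """Decode bytes with the declared charset, then resolve HTML entities."""
    return html.unescape(raw.decode(charset))


def has_mixed_script_word(text: str) -> bool:
    """Return True if any whitespace-separated word mixes Latin and Cyrillic letters."""
    for word in text.split():
        scripts = set()
        for ch in word:
            if ch.isalpha():
                # Unicode names start with the script, e.g. "CYRILLIC SMALL LETTER A".
                scripts.add(unicodedata.name(ch, "").split(" ")[0])
        if {"LATIN", "CYRILLIC"} <= scripts:
            return True
    return False


if __name__ == "__main__":
    spoofed = "p" + CYRILLIC_A + "ypal"
    for charset in SAMPLE_CHARSETS:
        raw = spoofed.encode(charset)
        # The raw bytes differ per charset, but the decoded text is the same.
        print(charset, raw.hex(), has_mixed_script_word(decode_text(raw, charset)))
    # Numeric HTML entities reduce to the same code point after unescaping.
    print(has_mixed_script_word(decode_text(b"p&#1072;ypal", "ascii")))
```

Whether this is efficient enough for a mail filter is exactly the open question in the thread; the sketch only shows that the per-charset variants collapse to one check once the text is decoded.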