Daniel Quinlan writes:
> Just to play devil's advocate, I have one other question: would it be
> cheaper and safer to simply run tests for certain languages using
> multiple character sets?
>
> Cheaper: is the cost really cheaper to convert?
>
> Safer: what if you guess wrong? what if the character set is hard to
> determine correctly (intentionally mixed-up, binary inserted,
> half-and-half, jumbled character sets, etc.).

FWIW, one spammer was doing this deliberately a while ago -- mailing
entirely in ASCII characters, but declaring the charset as GB2312
(iirc), and including the odd Chinese character towards the end of the
mail. Since the latter charset includes the entirety of ASCII
0x00-0x7f, that worked for ASCII spam delivery.

--j.
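
To make the trick above concrete, here is a rough Python sketch. It is an
illustration only, not anything from an existing codebase: the function
names, the set of charsets, and the 0.95 threshold are all assumptions
picked for the example. It relies on Python's "gb2312" codec (the EUC-CN
encoding), in which bytes 0x00-0x7f decode exactly as they do in ASCII.

```python
# Hypothetical set of multibyte CJK charsets to check; an assumption for
# this sketch, not an exhaustive or authoritative list.
CJK_CHARSETS = {"gb2312", "gbk", "gb18030", "big5", "euc-jp", "euc-kr"}


def ascii_fraction(raw: bytes) -> float:
    """Return the share of bytes that fall in the 7-bit range 0x00-0x7f."""
    if not raw:
        return 1.0
    return sum(b < 0x80 for b in raw) / len(raw)


def looks_mislabelled(raw: bytes, declared_charset: str,
                      threshold: float = 0.95) -> bool:
    """Flag a body that declares a CJK charset but is almost entirely ASCII.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    if declared_charset.lower() not in CJK_CHARSETS:
        return False
    return ascii_fraction(raw) >= threshold


# An all-ASCII body with one Chinese character tacked on the end, as in
# the spam described above. It decodes cleanly under the declared charset.
body = b"Buy cheap pills now! " * 20 + "中".encode("gb2312")
text = body.decode("gb2312")

print(looks_mislabelled(body, "GB2312"))
```

This would only ever be a weak signal: a legitimate sender whose mail
client defaults to GB2312 can write an all-English message too, so a
check like this could at most contribute to a score rather than decide
anything on its own.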
