https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6188
--- Comment #5 from Warren Togami <[email protected]> 2009-08-31 22:14:52 PST ---

rawbody __OBFUSCATING_COMMENT_A /\w(?:<![^>]*>)+\w/

\w is problematic in languages like Japanese, where there is not necessarily any whitespace between words. A multi-byte space could also be present in encoded form, but we would never know it is whitespace, since spamassassin does not decode by default. So it might be impossible to fix this rule for encoded languages.

Why?

* spamassassin does not decode by default. Most rules work just fine without decoding. Optional decoding makes it WAY slower, which is why we don't do it.
* Even if decoding were not a problem, the whitespace assumptions used in the test cases for this rule are linguistically invalid for languages like Japanese, which do not put spaces between words in a sentence.

What should we do if __OBFUSCATING_COMMENT_A cannot be fixed? This could make spamassassin dangerous for sizable populations of Asian users, such as Chinese users, whose mail we don't test at all in the corpus.

Should we apply the above patch fixing __OBFUSCATING_COMMENT_B at least?
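
To make the two concerns above concrete, here is a rough sketch, not a reproduction of the actual failing test case. It uses Python's `re` as a stand-in for Perl, runs the pattern on already-decoded text (which spamassassin does not do by default), and uses made-up sample strings; all of these are assumptions. How undecoded bytes in a given charset interact with \w is not modelled here.

```python
import re

# Pattern body copied from the rule quoted above. Python's `re` stands in
# for Perl here, which is an assumption of this sketch.
OBFUSCATING_COMMENT_A = re.compile(r"\w(?:<![^>]*>)+\w")


def hits(text):
    """Return True if the rule's pattern matches anywhere in `text`."""
    return OBFUSCATING_COMMENT_A.search(text) is not None


# The case the rule is meant to catch: a comment splitting one Latin word.
obfuscated = "V<!-- x -->iagra"

# Legitimate Latin-script HTML: whitespace sits between the words and the comment.
latin_ok = "Hello <!-- note --> world"

# Legitimate Japanese HTML (hypothetical sample): there is no whitespace between
# words, so the comment sits directly between two word characters.
japanese_ok = "こんにちは<!-- note -->世界"

print(hits(obfuscated), hits(latin_ok), hits(japanese_ok))

# Ideographic space U+3000 counts as whitespace once decoded, but none of its
# undecoded UTF-8 bytes (E3 80 80) is whitespace to a byte-level \s.
ideographic_space = "\u3000"
decoded_is_space = re.search(r"\s", ideographic_space) is not None
raw_is_space = re.search(rb"\s", ideographic_space.encode("utf-8")) is not None

print(decoded_is_space, raw_is_space)
```

In this sketch the legitimate Japanese sample is indistinguishable from the obfuscated one as far as the pattern is concerned, and the multi-byte space is only recognizable as whitespace after decoding.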
