>-----Original Message-----
>From: Robert Menschel [mailto:[EMAIL PROTECTED]
>Sent: Wednesday, December 01, 2004 9:56 AM
>To: Johnson, Robert F; users@spamassassin.apache.org
>Subject: Re[2]: Japanese False Postives with Spam Assassin 3.01 and RH WS
>3.0
>
>Hello Robert,
>
>Tuesday, November 30, 2004, 9:25:52 PM, Daniel wrote:
>
>DQ> The problem doesn't sound like it's SpamAssassin despite the subject
>DQ> line of this email, rather it's third-party rulesets.
>
>I agree.
>
>DQ> "Johnson, Robert F" <[EMAIL PROTECTED]> writes:
>
>>> Based on spot checking of a couple of dozen examples, I didn't see any
>>> significant pattern of out of the box rules being involved, mostly SARE
>>> or WIKI rules. The most heavily implicated were the following
>>> (MANGLED and SARE_SUB_CASH_CHAR probably had the biggest impact):
>>>
>>> SARE Rules
>>> SARE_SUB_CASH_CHAR
>>> SARE_RAND_2
>
>Can you email me a couple of examples that hit these rules,
>preferably in a zip or gz file? I maintain the Subject rules file for
>SARE, and would like to refine/rescore SARE_SUB_CASH_CHAR to help
>avoid your FPs. I'll also forward the info to the SARE ninja that
>maintains our Random rules file.
>
>>> WIKI Rules
>>> MANGLED_LIST
>>> MANGLED_LIPS
>>> J_CHICKENPOX_12
>>> J_CHICKENPOX_22
>
>All of these are language-related rules, which work well in English,
>might be subject to an occasional misfire in a non-English Western
>European language, and can readily misfire in any
>non-Latin/non-Romance language. If you regularly get non-spam in
>Japanese, you should probably drop the entire MANGLED and CHICKENPOX
>families. If you're using Tripwire, you should drop that also since it
>too can misfire on Japanese non-spam.
>
>Bob Menschel
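For reference, one way to act on the advice above without removing the ruleset files themselves is to zero the scores in local.cf, since a rule with a score of 0 is not run. The sketch below is only an illustration: the helper name disable_lines is made up for this example, and the list holds only the four WIKI rules named in this thread, not the complete MANGLED and CHICKENPOX families.

```python
# Rule names below are only the ones named in this thread; the full
# MANGLED and CHICKENPOX families contain more rules than these.
RULES_TO_DISABLE = [
    "MANGLED_LIST",
    "MANGLED_LIPS",
    "J_CHICKENPOX_12",
    "J_CHICKENPOX_22",
]


def disable_lines(rule_names):
    """Return local.cf lines that set each rule's score to 0 (disabled)."""
    # Hypothetical helper for illustration; not part of SpamAssassin.
    return [f"score {name} 0" for name in rule_names]


if __name__ == "__main__":
    # Print the lines so they can be reviewed and pasted into local.cf.
    print("\n".join(disable_lines(RULES_TO_DISABLE)))
```

The output is meant to be reviewed by hand before it goes into local.cf. Dropping the third-party .cf files entirely, as suggested above, is the simpler route when none of a family's rules are wanted.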
[Johnson, Robert F]

Bob,

Thanks for the reply. I will try to get some examples for your analysis. I may have to attempt a repro of the issue. I will let you know soon.

Could the SARE team provide a guideline regarding the best SARE and WIKI rule sets to work in an environment that supports the following languages? Maybe some sort of local language compatibility matrix would be useful to many users. I would be happy to help put that together in any way I could.

Japanese, Korean, traditional and simplified Chinese, English, assorted European.

Regards,
Rob