>-----Original Message-----
>From: alan premselaar [mailto:[EMAIL PROTECTED]
>Sent: Tuesday, November 30, 2004 5:55 PM
>To: Johnson, Robert F
>Cc: users@spamassassin.apache.org
>Subject: Re: Japanese False Postives with Spam Assassin 3.01 and RH WS 3.0
>
>Johnson, Robert F wrote:
>> Hi,
>>
>> I have been having a high occurrence of Japanese false positives since
>> upgrading from Spam Assassin 2.64 on RedHat 7.3 with MimeDefang 2.31 to
>> Spam Assassin 3.01 on RedHat Workstation 3.0 installed site wide via
>> MimeDefang 2.44. I am wondering if this is due to the problem with Red
>> Hat 9.0 Unicode UTF-8. I had no issues with Japanese false positives in
>> the RH 7.3 based environment.
>>
>> I've a few articles regarding this issue, but need some help
>> understanding correct LANG configurations for Spam Assassin 3.01 on
>> RedHat Workstation 3.0 installed site wide via MimeDefang 2.44.
>>
>> I currently have the following set in /etc/sysconfig/i18n: ( we are US
>> based)
>>
>> LANG="en_US"
>> SUPPORTED="en_US"
>>
>> I compiled Spam Assassin from tar ball with LANG set to en_US (export
>> LANG=en_US). Are these settings correct? Could this be causing the
>> Japanese false positives?
>>
>> Are there any other known issues that can cause Japanese false positives
>> using Spam Assassin 3.01?
>>
>> Thanks for any help!
>>
>> Rob
>>
>>
>Rob,
>
> just a couple obvious questions. what are your ok_locales and
>ok_languages settings in your sa-mimedefang.cf file set to?
>
>what rules are the japanese emails hitting when they're tagged as false
>positives?
>
>I'm based in Japan, just recently upgraded to SA 3.01 with MD 2.49 and
>using a MySQL based bayes database and I've been noticing some
>quirkiness with Japanese email as well, but haven't really pinned it
>down yet.
>
>alan
[Johnson, Robert F]

Thanks for your reply. I had ok_locales set to all but didn't have
ok_languages explicitly set. I think that is OK, since the default value
is supposed to be all.

Based on spot checking a couple of dozen examples, I didn't see any
significant pattern of out-of-the-box rules being involved; the hits were
mostly SARE or WIKI rules. The most heavily implicated were the following
(the MANGLED rules and SARE_SUB_CASH_CHAR probably had the biggest impact):

SARE Rules
  SARE_SUB_CASH_CHAR
  SARE_RAND_2

WIKI Rules
  MANGLED_LIST
  MANGLED_LIPS
  J_CHICKENPOX_12
  J_CHICKENPOX_22
  HTML_BACKHAIR_4

Out of the Box:
  GAPPY_SUBJECT
  FREE_SAMPLE
  OBSCURED_EMAIL

Rob
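
For anyone who wants to go beyond spot checking, here is a rough Python sketch that tallies which rules fire across a set of saved false positives. It is only an illustration and rests on an assumption: that each saved message carries an X-Spam-Status style header with a tests= field. A MIMEDefang filter may record the hits differently, in which case the parsing would need adjusting. The function names rules_hit and tally_rules are made up for this example.

```python
import re
from collections import Counter
from email import message_from_string


def rules_hit(raw_message):
    """Return the rule names listed in a message's X-Spam-Status header.

    Assumes a header of the form
    'Yes, score=7.1 required=5.0 tests=RULE_A,RULE_B autolearn=no ...'.
    Returns an empty list if the header or the tests= field is missing.
    """
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    # Folded headers can break the rule list after a comma; rejoin it.
    status = re.sub(r",\s+", ",", status)
    match = re.search(r"tests=(\S+)", status)
    if match is None:
        return []
    # Drop empty entries and the literal 'none' used when nothing fired.
    return [name for name in match.group(1).split(",")
            if name and name != "none"]


def tally_rules(raw_messages):
    """Count how many messages each rule fired on."""
    counts = Counter()
    for raw in raw_messages:
        # set() so a rule is counted at most once per message.
        counts.update(set(rules_hit(raw)))
    return counts


if __name__ == "__main__":
    sample = (
        "X-Spam-Status: Yes, score=7.1 required=5.0 "
        "tests=GAPPY_SUBJECT,MANGLED_LIST,\n"
        "\tSARE_RAND_2 autolearn=no version=3.0.1\n"
        "Subject: test\n"
        "\n"
        "body\n"
    )
    for rule, count in tally_rules([sample]).most_common():
        print(rule, count)
```

Running tally_rules over a directory of false positives and sorting by count should make it easier to see whether a handful of add-on rules account for most of the hits.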