http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4636





------- Additional Comments From [EMAIL PROTECTED]  2006-01-12 06:08 -------
(In reply to comment #28)
> FWIW, it's not good to have that level of differing behaviour, based on
> whether a dependency module is installed or not; I'd prefer to have another
> way of turning this functionality on or off.  (whether that's manually
> enabled in config, or inferred from the mail traffic or environment is
> another question.)

It may sound odd, but an option is an option.  That is, an option should
reflect the user's preference or choice.  For those who live in Asian
countries, multi-byte handling is nothing special.  Charset normalization
should be transparent to the user if the requirements are met.  I would prefer
not to create an option for this operation.

I think there should be an option if we support UTF-8 aware regexps, because
there will be a significant performance loss.  That would be the user's choice.
Charset normalization is not a matter of choice for Japanese and other Asian
users.

If $normalize_supported in Node.pm were evaluated many times, it would be a
problem.  But if it is evaluated only once or a few times, I think it's OK.
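
To illustrate the "evaluate it only once" point, here is a minimal sketch in
Python (not the actual Node.pm code).  The function name, the counter, and the
probed module "codecs" are stand-ins made up for this example; the idea is only
that the dependency check is cached after the first call.

```python
import importlib.util
from functools import lru_cache

probe_count = 0  # only here to show how often the real probe runs


@lru_cache(maxsize=None)
def normalize_supported(module_name="codecs"):
    """Return True if the (stand-in) dependency module can be found.

    The result is cached, so the lookup happens once per module name
    no matter how many messages are processed afterwards.
    """
    global probe_count
    probe_count += 1
    return importlib.util.find_spec(module_name) is not None


# Simulate many message parts asking the same question.
for _ in range(1000):
    normalize_supported()
```

With this shape the cost of the check is paid on the first message only, so
how often the calling code asks no longer matters.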

(In reply to comment #21)
> - spam patterns are generally encoded in a limited number of character sets

Recently Japanese spam has been increasing.  The charsets most frequently used
are iso-2022-jp and shift-jis.  There seem to be some poorly written mass
mailers.  Sometimes Content-Type/Content-Transfer-Encoding is missing.
Sometimes the header encoding is wrong; for example, the embedded charset name
is iso-2022-jp whereas the actual charset is shift-jis.
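
A rough sketch of how a mislabeled or unlabeled message could still be decoded,
written in Python rather than as SpamAssassin code.  The function name and the
fallback order are assumptions for illustration only; Shift_JIS and EUC-JP
bytes can sometimes decode as each other, so a real implementation would need a
proper detector rather than this naive trial order.

```python
# Hypothetical trial order, used when the declared charset is missing or wrong.
FALLBACK_CHARSETS = ["iso2022_jp", "utf-8", "shift_jis", "euc_jp"]


def decode_with_fallback(raw, declared=None):
    """Return (text, charset_used), trying the declared charset first."""
    candidates = ([declared] if declared else []) + FALLBACK_CHARSETS
    for charset in candidates:
        try:
            return raw.decode(charset), charset
        except (UnicodeDecodeError, LookupError):
            # Wrong label or unknown charset name: try the next candidate.
            continue
    # Nothing fit: keep the bytes readable instead of failing outright.
    return raw.decode("latin-1"), "latin-1"


# A body that is really Shift_JIS but labeled iso-2022-jp in the header.
mislabeled = "日本語のメール".encode("shift_jis")
text, used = decode_with_fallback(mislabeled, declared="iso-2022-jp")

# The same body with no Content-Type charset at all.
text_unlabeled, used_unlabeled = decode_with_fallback(mislabeled)
```

The declared name is tried first so that correctly labeled mail is unaffected;
the fallbacks only matter when that first attempt fails.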

> - therefore, catch rates do not increase with recoding (if anything, they are
>   quite likely to decrease due to spam tricks causing us to pick the wrong
>   character set)

I only have a few days of experience, but most Japanese ham now scores BAYES_00
to BAYES_50 and most spam now scores BAYES_99.  The most significant example is
that the Bayes score for one ham sample changes from BAYES_99 (non-patched) to
BAYES_00 (patched)!!
In general, the Bayes score for spam is unchanged, but the scores for ham have
decreased.

Word matching becomes exact, so we will be able to maintain language-specific
rules easily.  In addition, the special note in user_prefs.template will no
longer be necessary, although I have not tested this yet.
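
A small Python illustration of why word matching improves once the charset is
normalized (this is not a SpamAssassin rule; the sample sentence and variable
names are made up).  A byte pattern written in one encoding only matches mail
that happens to use that encoding, while a single pattern on the decoded text
matches all of them.

```python
import re

WORD = "無料"  # "free of charge", a typical word to write a rule for
CHARSETS = ["iso2022_jp", "shift_jis", "euc_jp"]

# The same sentence as it would arrive in three different encodings.
samples = {name: "これは無料です".encode(name) for name in CHARSETS}

# Without normalization: a rule written as Shift_JIS bytes.
raw_pattern = re.compile(re.escape(WORD.encode("shift_jis")))
raw_hits = {name: bool(raw_pattern.search(data))
            for name, data in samples.items()}

# With normalization: decode first, then one rule covers every encoding.
text_pattern = re.compile(re.escape(WORD))
normalized_hits = {name: bool(text_pattern.search(data.decode(name)))
                   for name, data in samples.items()}
```

Here the raw byte rule hits only the Shift_JIS sample, whereas the rule applied
after decoding hits all three.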



