Parsing of undecoded UTF-8 will give garbage when decoding entities at /opt/csw/share/perl/csw/Mail/SpamAssassin/HTML.pm line 182.
Attached is an example message that triggers this warning; the message is ham. The HTML::Parser man page says something about passing UTF-8 to $p->parse, or something along those lines, but I do not understand what that means. Is there a patch for SA that fixes this?
If it matters, here are my locale settings. I also tried with LC_ALL=C, and that did not help.
[EMAIL PROTECTED] tmp]# locale
LANG=
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=

Thanks,
Alex
sa_learn_error.gz
Description: GNU Zip compressed data