Samuel Thibault wrote (27.12.2006 at 10:49):
> Teemu Likonen wrote on Wed, 27 Dec 2006 10:23:39 +0200:
> > I didn't know that bogofilter is able to check message headers for
> > correct encoding. I use KMail (KDE's email client) and it converts
> > messages to locale charset before sending them to bogofilter. How do
> > other programs behave? What is the correct behaviour (if there is
> > one)?
>
> I'd say the correct behavior is to just keep the message intact.

I checked how bogofilter handles messages with different encodings and Content-Type headers. Bogofilter works as it should: it reads the message's Content-Type header and gets the charset from there. With "unicode=yes" (which is the default), bogofilter converts the message to UTF-8 and stores the words in its database. If no charset is defined in the message's Content-Type header, bogofilter uses its own charset_default setting (ISO-8859-1 by default). I think ISO-8859-1 is a good default: I believe most messages without Content-Type headers are in some kind of Western European charset, and most spam is probably in English.

So my bug report was pretty pointless from bogofilter's point of view. :) I guess this bug can be closed. At least I downgraded the severity to "normal".

There remains this KMail problem, though. Maybe it's worth filing a new report.
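
For illustration, the charset handling described above can be sketched roughly like this. It is not bogofilter's actual code, only a minimal Python approximation of the behaviour I observed; the function name body_as_utf8 and the constant CHARSET_DEFAULT are made up for this example, and it only handles simple non-multipart messages.

```python
from email import message_from_bytes

# Hypothetical constant mirroring bogofilter's charset_default setting.
CHARSET_DEFAULT = "iso-8859-1"


def body_as_utf8(raw: bytes, charset_default: str = CHARSET_DEFAULT) -> bytes:
    """Return the body of a non-multipart message re-encoded as UTF-8.

    The charset is taken from the Content-Type header; if the header or
    its charset parameter is missing, charset_default is used instead.
    """
    msg = message_from_bytes(raw)
    charset = msg.get_content_charset() or charset_default
    # decode=True undoes the Content-Transfer-Encoding and yields bytes
    # (None for multipart messages, which this sketch does not handle).
    payload = msg.get_payload(decode=True) or b""
    try:
        text = payload.decode(charset, errors="replace")
    except LookupError:
        # Unknown charset name in the header: fall back to the default.
        text = payload.decode(charset_default, errors="replace")
    return text.encode("utf-8")


if __name__ == "__main__":
    # No Content-Type header, body bytes are ISO-8859-1.
    untagged = b"Subject: test\n\nhyv\xe4\xe4 p\xe4iv\xe4\xe4\n"
    print(body_as_utf8(untagged).decode("utf-8"))
```

This also shows why a mail client should pass the message through unchanged: if the body is converted to the locale charset first while the Content-Type header still names the original charset, the filter decodes the bytes with the wrong charset.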