A NOTE has been added to this issue.
======================================================================
http://www.dbmail.org/mantis/view.php?id=548
======================================================================
Reported By:                idk
Assigned To:
======================================================================
Project:                    DBMail
Issue ID:                   548
Category:                   IMAP daemon
Reproducibility:            N/A
Severity:                   feature
Priority:                   normal
Status:                     new
target:
======================================================================
Date Submitted:             22-Mar-07 11:23 CET
Last Modified:              22-Mar-07 23:48 CET
======================================================================
Summary:                    WISH: Better parsing 8bit header characters
Description:
Only 7bit characters are valid in mail header values, so accented characters should be escaped. But... occasionally I get a message from a buggy mail client which ignores this rule.
MSOE's message list shows an invalid subject (it looks like UTF8 encoding displayed byte by byte), but the opened message displays the Subject header correctly (parsed from the header part of the message). So I think there is a solution. MSOE under Windows (CZE) has default code page 1250, so one option is that MSOE interpreted the Subject from the whole message content "correctly"; the other is that it fetched the Content-Type header value (see Additional Information). I think the second option should be applicable to DBMail.
======================================================================
Relationships       ID      Summary
----------------------------------------------------------------------
related to          0000538 incorrect field cache values for messag...
======================================================================

----------------------------------------------------------------------
 paul - 22-Mar-07 14:42
----------------------------------------------------------------------
This is exactly how it's done at the moment.

If a header is 8bit the header string is converted to utf8.

If the content-type header contains a charset specification, dbmail will try to convert from the specified charset to utf8.

Else dbmail will fall back to the charset specified in the DEFAULT_MSG_ENCODING config value and try to convert the string to utf8, assuming the header was encoded in that charset.

If both fail, dbmail will replace all 8bit characters with '?'.
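
The fallback order paul describes can be sketched roughly as follows. This is only an illustration in Python, not DBMail's actual C code: the function name header_to_text and its parameters are hypothetical, and Python's built-in codecs stand in for whatever conversion library DBMail really uses.

```python
from typing import Optional


def header_to_text(raw: bytes,
                   content_charset: Optional[str],
                   default_charset: str) -> str:
    """Hypothetical sketch of the fallback chain described in the note above.

    Returns a Python str (Unicode); encoding it as UTF-8 afterwards would
    correspond to "convert to utf8".
    """
    # Pure 7bit headers need no conversion at all.
    if all(b < 0x80 for b in raw):
        return raw.decode("ascii")

    # 1) charset from the Content-Type header, 2) the configured default.
    for charset in (content_charset, default_charset):
        if not charset:
            continue
        try:
            return raw.decode(charset)
        except (LookupError, UnicodeDecodeError):
            # Unknown charset name or bytes invalid in it: try the next one.
            continue

    # Both attempts failed: replace every 8bit byte with '?'.
    return "".join(chr(b) if b < 0x80 else "?" for b in raw)


# 0xEC is an e with caron in both iso-8859-2 and windows-1250.
sample = b"Brn\xec 2007"
with_content_type = header_to_text(sample, "iso-8859-2", "utf-8")
with_cp1250_default = header_to_text(sample, None, "windows-1250")
with_utf8_default = header_to_text(sample, None, "utf-8")
```

Under these assumptions, the same 8bit bytes come out readable when either the Content-Type charset or a windows-1250 default is available, and fall through to the '?' replacement when the only candidate is utf8.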

----------------------------------------------------------------------
 idk - 22-Mar-07 16:25
----------------------------------------------------------------------
mysql> SELECT HEX(SUBSTRING(messageblk, 1087, 53)) FROM dbmail_messageblks WHERE physmessage_id = 273400 AND is_header = 1;
5375626A6563743A 20 566964656F70726F686C ED 646B61 20 76656C6574726875 20 72796261 F8 656E ED 20 76 20 42726E EC 20 32303037
(spaces added around \x20 and >\x7F chars)

mysql> SELECT SUBSTRING(messageblk, 1087, 53) FROM dbmail_messageblks WHERE physmessage_id = 273400 AND is_header = 1;
Subject: Videoprohl?dka veletrhu ryba?en? v Brn? 2007

A001 UID FETCH 554133 (ENVELOPE)
* 97 FETCH (UID 554133 ENVELOPE ("Wed, 21 Mar 2007 18:09:41 +0100" "=?UTF-8?q?Videoprohl=C3=ADdka_veletrhu_ryba=C5=99en=C3=AD_?= =?iso-8859-2?q?v_Brn=EC?= 2007" ((NIL NIL "chytej" "chytej.cz")) ((NIL NIL "chytej" "chytej.cz")) ((NIL NIL "chytej" "chytej.cz")) ((NIL NIL "undisclosed-recipients" NIL)) NIL NIL NIL "<[EMAIL PROTECTED]>"))
A001 OK UID FETCH completed

It seems ok, because UTF(C3 AD) == WIN(ED), UTF(C5 99) == WIN(F8), ISO(EC) == WIN(EC).

Do you mean the bug is in the MSOE mail client? Does MSOE recognize a =?UTF-8?q? prefix? Or mixed UTF8 and ISO 8859-2?

I'll attach screenshots of this situation. Red underlining highlights the wrong characters and green the "correct" ones (in msoe.jpg you can see a font change from this position to the end of the line, incl. the number 2007, but that looks like an MSOE bug; squirrel (SquirrelMail 1.4.10 SVN) shows both wrong).

(Note for http://www.dbmail.org/mantis/view.php?id=538: I have revision 2471, default_msg_encoding=utf8.)

----------------------------------------------------------------------
 paul - 22-Mar-07 16:56
----------------------------------------------------------------------
Now why are you using default_msg_encoding=utf8?? Try using windows-1250 since you mentioned that is the charset that's causing the problems.
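
The byte comparison in idk's 16:25 note can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the variable and function names are illustrative, and it says nothing about how MSOE or DBMail handle the header internally. It decodes the raw Subject bytes from the hex dump as windows-1250 and iso-8859-2, then decodes the RFC 2047 encoded-words from the ENVELOPE response, where each encoded-word carries its own charset.

```python
from email.header import decode_header

# Raw Subject value from the messageblk hex dump (8bit, not escaped).
raw = bytes.fromhex(
    "566964656F70726F686C" "ED" "646B61" "20" "76656C6574726875" "20"
    "72796261" "F8" "656E" "ED" "20" "76" "20" "42726E" "EC" "20" "32303037"
)

# The three 8bit bytes map to the same characters in both code pages.
as_cp1250 = raw.decode("cp1250")
as_latin2 = raw.decode("iso-8859-2")

# Subject as returned in the IMAP ENVELOPE response.
envelope_subject = (
    "=?UTF-8?q?Videoprohl=C3=ADdka_veletrhu_ryba=C5=99en=C3=AD_?= "
    "=?iso-8859-2?q?v_Brn=EC?= 2007"
)


def decode_subject(value: str) -> str:
    """Decode each encoded-word with the charset it declares."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            chunk = chunk.decode(charset or "ascii")
        parts.append(chunk)
    return "".join(parts)


decoded = decode_subject(envelope_subject)

print(as_cp1250 == as_latin2)
print(" ".join(decoded.split()) == as_cp1250)
```

Mixing a UTF-8 encoded-word with an iso-8859-2 encoded-word in one header is allowed by RFC 2047, so a client that decodes encoded-words per charset should arrive at the same text as the raw windows-1250 bytes.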

----------------------------------------------------------------------
 idk - 22-Mar-07 23:48
----------------------------------------------------------------------
Why am I using UTF8 as the default? You told me to :) In bug http://www.dbmail.org/mantis/view.php?id=265 you wrote:

... you do need to change dbmail.conf and add two new entries:
encoding=utf8
default_msg_encoding=utf8

So I did. Nevertheless, I have tried changing it to WINDOWS-1250, but with the same result.

Regardless of the default charset, there is an inconsistency between the cached headervalue (dbmail_headervalue.headervalue TEXT utf8_general_ci) and the binary content of the headers (dbmail_messageblks.messageblk LONGBLOB BINARY):

mysql> SELECT HEX(headervalue) FROM dbmail_headervalue WHERE id = 607434;
566964656F70726F686CC383C2AD646B612076656C65747268752072796261C385E284A2656EC383C2AD20762042726EC384E280BA2032303037

So
V        56
i        69
d        64
e        65
o        6F
p        70
r        72
o        6F
h        68
l        6C
i_acute  C383C2AD
d        64
k        6B
a        61
         20
v        76
e        65
l        6C
e        65
t        74
r        72
h        68
u        75
         20
r        72
y        79
b        62
a        61
r_circ   C385E284A2
e        65
n        6E
i_acute  C383C2AD
         20
v        76
         20
B        42
r        72
n        6E
e_circ   C384E280BA
         20
2        32
0        30
0        30
7        37

Issue History
Date Modified   Username       Field                    Change
======================================================================
22-Mar-07 11:23 idk            New Issue
22-Mar-07 14:42 paul           Note Added: 0001935
22-Mar-07 14:42 paul           Relationship added       related to 0000538
22-Mar-07 16:25 idk            Note Added: 0001936
22-Mar-07 16:28 idk            File Added: msoe.jpg
22-Mar-07 16:28 idk            File Added: squirrel.jpg
22-Mar-07 16:56 paul           Note Added: 0001937
22-Mar-07 23:48 idk            Note Added: 0001939
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://twister.fastxs.net/mailman/listinfo/dbmail-dev
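
One possible reading of the headervalue dump in the 23:48 note is that the subject was converted to UTF-8 correctly and then converted a second time as if it were a single-byte Windows-1252 string. That is an assumption, not something confirmed in this thread, and where such a second conversion would happen is not established here. The Python sketch below only shows that undoing that one step recovers the expected subject; the variable names are illustrative.

```python
# Hex value of dbmail_headervalue.headervalue as posted in the note.
cached = bytes.fromhex(
    "566964656F70726F686CC383C2AD646B612076656C65747268752072796261"
    "C385E284A2656EC383C2AD20762042726EC384E280BA2032303037"
)

# What a UTF-8 aware client sees when it reads the cached value.
as_stored = cached.decode("utf-8")

# Assumption: the text was already UTF-8 and its bytes were re-read as
# Windows-1252 before being encoded to UTF-8 again. Reversing that step:
repaired = as_stored.encode("cp1252").decode("utf-8")

print(as_stored != repaired)
print(len(as_stored), len(repaired))
```

If this reading is right, each accented character occupies two characters in the cached value (for example C3 83 C2 AD for i_acute), which would also explain why changing default_msg_encoding made no visible difference to the stored headervalue.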