A NOTE has been added to this issue.
======================================================================
http://www.dbmail.org/mantis/view.php?id=1038
======================================================================
Reported By:                ALyarskiy
Assigned To:
======================================================================
Project:                    DBMail
Issue ID:                   1038
Category:                   General
Reproducibility:            sometimes
Severity:                   major
Priority:                   normal
Status:                     new
target:
======================================================================
Date Submitted:             20-Jan-14 12:42 CET
Last Modified:              23-Jan-14 09:43 CET
======================================================================
Summary:                    UTF8 support. Sortfield generated ignoring
multibyte symbols.
Description:
dm_messages.c: function _header_cache

.....
if(issubject) {
    char *s, *t = dm_base_subject(value);
    s = dbmail_iconv_str_to_db(t, charset);
    g_strlcpy(sortfield, s, CACHE_WIDTH-1);
    g_free(s);
    g_free(t);
}
.....
It does not work correctly with multibyte strings.
======================================================================

----------------------------------------------------------------------
 (0003625) paul (administrator) - 20-Jan-14 12:51
 http://www.dbmail.org/mantis/view.php?id=1038#c3625
----------------------------------------------------------------------
Please provide steps to reproduce, or better yet: a patch that is
validated to fix this problem by a unit-test that exercises it.

----------------------------------------------------------------------
 (0003626) ALyarskiy (reporter) - 22-Jan-14 08:19
 http://www.dbmail.org/mantis/view.php?id=1038#c3626
----------------------------------------------------------------------
One possible approach I see at the moment is to process headers through
forced utf8 encoding, like this:

char *s, *t = dm_base_subject(value);
s = dbmail_iconv_str_to_utf8(t, charset);
... PROCESSING ...
...
if db_encoding != utf8:
    dbmail_iconv_str_to_db()

The other way is to pin the database encoding to utf8.

DB backend limitations:
Oracle supports 4-byte characters
Postgres supports 4-byte characters
SQLite supports 4-byte characters
MySQL's default utf8 is 3-byte, but it can handle 4-byte characters since
version 5.5.3 by using the "utf8mb4" encoding

4-byte characters are supplementary characters
(http://www.i18nguy.com/unicode/supplementary-test.html).
So there is a possible problem with MySQL versions prior to 5.5.3; later
versions will require a db-update to switch from 3-byte to 4-byte
(http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-upgrading.html).

At the moment I have a patch that works with utf8 headers (postgres
backend). I will provide it right after some tests.

----------------------------------------------------------------------
 (0003627) ALyarskiy (reporter) - 23-Jan-14 09:04
 http://www.dbmail.org/mantis/view.php?id=1038#c3627
----------------------------------------------------------------------
Ok, here is the patch.
It is my first experience with C, so the patch should be reviewed by a
real developer =)

The patch includes:
1. A new function _header_exists to check whether a header already exists.
   It is not really cool to try to insert and check for errors.
2. UTF8 header support, assuming a max size of 4 bytes (some issues with
   mysql, see previous note).
3. A few added trace messages.

Example header:
RAW=[=?koi8-r?Q?[XXXXXX]_[33666]_=FA=C1=D0=D2=CF=D3_=CE=C1_=D4=C5=C8._=D0=CF?=
 =?koi8-r?B?xMTF0tbL1SAo+sHQ0s/TIMTP0C4gyc7Gz9LNwcPJySksIM7Fy8/S0sXL1M7P?=
 =?koi8-r?B?xSDazsHexc7JxSDEz9AuIMHU0snC1dTBINPPINrOwd7FzsnFzSDQzyDVzc/M?=
 =?koi8-r?B?3sHOycAgxMzRINTJ0MEgxM/Hz9fP0sEsIM7FINPX0drBzs7Px88g0yDc1MnN?=
 =?koi8-r?Q?_=C1=D4=D2=C9=C2=D5=D4=CF=CD?=]

----------------------------------------------------------------------
 (0003628) paul (administrator) - 23-Jan-14 09:43
 http://www.dbmail.org/mantis/view.php?id=1038#c3628
----------------------------------------------------------------------
Ok, you're on to something here, but I'm rejecting the patch for the
following reasons:

- please use git-diff to generate the patch, or better yet: fork on
github, clone, hack, test, commit, push, and send me a pull request. Since
you're working off the 3.1 code that will allow easy forward porting to
the master branch.

- the patch does way too much. You're fixing a non-existing problem with
the new _header_exists function. That case is already well covered in the
code. Also, *all* queries *must* happen inside a TRY/CATCH/FINALLY block.

- I don't see any unit-tests that demonstrate the problem and the fix:
please expand tests/check_dbmail_message.c

Issue History
Date Modified    Username       Field                    Change
======================================================================
20-Jan-14 12:42  ALyarskiy      New Issue
20-Jan-14 12:51  paul           Note Added: 0003625
22-Jan-14 08:19  ALyarskiy      Note Added: 0003626
23-Jan-14 09:04  ALyarskiy      Note Added: 0003627
23-Jan-14 09:04  ALyarskiy      File Added: utf8_header.patch
23-Jan-14 09:43  paul           Note Added: 0003628
======================================================================
_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
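
For readers following the Description: g_strlcpy() truncates by byte count, so when the subject is longer than the sortfield buffer the cut can land in the middle of a multibyte UTF-8 character. The sketch below shows one possible way to truncate on a character boundary instead. It is only an illustration and is not taken from utf8_header.patch or from the DBMail sources; the name utf8_strlcpy is hypothetical, and the code assumes the input is already valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of DBMail or GLib): copy src into dst,
 * whose capacity is dstsize bytes including the terminating NUL, without
 * splitting a UTF-8 sequence. Assumes src is valid UTF-8.
 * Returns the number of bytes copied, excluding the NUL.
 */
size_t utf8_strlcpy(char *dst, const char *src, size_t dstsize)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    if (dstsize == 0)
        return 0;

    len = strlen(src);
    if (len >= dstsize) {
        len = dstsize - 1;
        /*
         * src[len] is the first byte that will be dropped. If it is a
         * continuation byte (bit pattern 10xxxxxx), the cut falls inside
         * a character, so back up to that character's lead byte and drop
         * the whole character.
         */
        while (len > 0 && ((unsigned char)src[len] & 0xC0) == 0x80)
            len--;
    }

    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

With a 3-byte buffer, copying "a" followed by a 2-byte character keeps only "a" rather than "a" plus half a character. A test along these lines in tests/check_dbmail_message.c, fed a long encoded subject such as the example header above, would be one way to demonstrate the problem and the fix.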