A NOTE has been added to this issue.
======================================================================
http://www.dbmail.org/mantis/view.php?id=1038
======================================================================
Reported By:                ALyarskiy
Assigned To:
======================================================================
Project:                    DBMail
Issue ID:                   1038
Category:                   General
Reproducibility:            sometimes
Severity:                   major
Priority:                   normal
Status:                     new
target:
======================================================================
Date Submitted:             20-Jan-14 12:42 CET
Last Modified:              22-Jan-14 08:19 CET
======================================================================
Summary:                    UTF8 support. Sortfield generated ignoring multibyte symbols.
Description:
dm_messages.c: function _header_cache

.....
        if(issubject) {
                char *s, *t = dm_base_subject(value);
                s = dbmail_iconv_str_to_db(t, charset);
                g_strlcpy(sortfield, s, CACHE_WIDTH-1);
                g_free(s);
                g_free(t);
        }
.....

It does not work correctly with multibyte strings.
======================================================================

----------------------------------------------------------------------
 (0003625) paul (administrator) - 20-Jan-14 12:51
 http://www.dbmail.org/mantis/view.php?id=1038#c3625
----------------------------------------------------------------------
Please provide steps to reproduce, or better yet: a patch that is validated
to fix this problem by a unit test that exercises it.

----------------------------------------------------------------------
 (0003626) ALyarskiy (reporter) - 22-Jan-14 08:19
 http://www.dbmail.org/mantis/view.php?id=1038#c3626
----------------------------------------------------------------------
One possible approach I see at the moment is to process headers through
forced UTF-8 encoding, like this:

        char *s, *t = dm_base_subject(value);
        s = dbmail_iconv_str_to_utf8(t, charset);
        ... PROCESSING ...
        ...
        if db_encoding != utf8:
                dbmail_iconv_str_to_db()

The other way is to pin the database encoding to UTF-8.

DB backend limitations:
Oracle supports 4-byte characters
Postgres supports 4-byte characters
SQLite supports 4-byte characters
MySQL's default utf8 is 3-byte, but it can handle 4-byte characters since
version 5.5.3 by using the "utf8mb4" encoding

4-byte characters are supplementary characters
(http://www.i18nguy.com/unicode/supplementary-test.html).
So there is a possible problem with MySQL versions prior to 5.5.3, and later
versions will require a database update to switch from 3-byte to 4-byte
(http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-upgrading.html).

At the moment I have a patch that works with UTF-8 headers (Postgres
backend). I will provide it after some tests.

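To illustrate the byte-versus-character concern in the description: one plausible failure mode (an assumption, since the report does not spell it out) is that a byte-limited copy such as g_strlcpy cuts a multibyte UTF-8 sequence in half and leaves an invalid string in sortfield. The sketch below is a hypothetical helper in plain C, utf8_truncate_copy, which is not part of DBMail and is not the patch mentioned in the note. It backs the cut point up to the nearest character boundary and assumes the input is already valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not a DBMail function): copy src into dst, whose
 * capacity is dst_size bytes including the terminating NUL, without
 * splitting a UTF-8 multibyte sequence.
 * Assumes src is valid UTF-8. Returns the number of bytes written,
 * excluding the NUL.
 */
static size_t utf8_truncate_copy(char *dst, const char *src, size_t dst_size)
{
    size_t len, n;

    assert(dst != NULL && src != NULL);
    if (dst_size == 0)
        return 0;

    len = strlen(src);
    n = len < dst_size - 1 ? len : dst_size - 1;

    /*
     * If the string was cut, src[n] is the first dropped byte. While that
     * byte is a continuation byte (bit pattern 10xxxxxx), the cut falls
     * inside a character, so step back to the start of that character.
     */
    if (n < len) {
        while (n > 0 && ((unsigned char)src[n] & 0xC0) == 0x80)
            n--;
    }

    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

In _header_cache a helper like this would presumably be called with the UTF-8 form of the subject and the size of sortfield. How it should combine with dbmail_iconv_str_to_db for databases that are not UTF-8 is left open here.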
Issue History
Date Modified    Username       Field                    Change
======================================================================
20-Jan-14 12:42  ALyarskiy      New Issue
20-Jan-14 12:51  paul           Note Added: 0003625
22-Jan-14 08:19  ALyarskiy      Note Added: 0003626
======================================================================
_______________________________________________
Dbmail-dev mailing list
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
