A NOTE has been added to this issue. 
====================================================================== 
http://www.dbmail.org/mantis/view.php?id=1038 
====================================================================== 
Reported By:                ALyarskiy
Assigned To:                
====================================================================== 
Project:                    DBMail
Issue ID:                   1038
Category:                   General
Reproducibility:            sometimes
Severity:                   major
Priority:                   normal
Status:                     new
target:                      
====================================================================== 
Date Submitted:             20-Jan-14 12:42 CET
Last Modified:              23-Jan-14 09:43 CET
====================================================================== 
Summary:                    UTF8 support. Sortfield generated ignoring multibyte
symbols.
Description: 
dm_messages.c:
function _header_cache
.....
        if(issubject) {
                char *s, *t = dm_base_subject(value);
                s = dbmail_iconv_str_to_db(t, charset);
                g_strlcpy(sortfield, s, CACHE_WIDTH-1);
                g_free(s);
                g_free(t);
        }
.....

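This does not work correctly with multibyte strings: g_strlcpy counts
bytes, not characters, so the truncated sortfield can end in the middle of
a multibyte character.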
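
A minimal sketch of a truncating copy that respects UTF-8 character
boundaries, assuming the converted string is valid UTF-8. The helper name
utf8_strlcpy is hypothetical and is not part of DBMail or GLib; it only
illustrates the technique and is not the attached patch.

```c
#include <assert.h>
#include <string.h>

/* True if b is a UTF-8 continuation byte (bit pattern 10xxxxxx). */
static int is_utf8_continuation(unsigned char b)
{
        return (b & 0xC0) == 0x80;
}

/*
 * Hypothetical helper: copy src into dst, whose capacity is dst_size bytes
 * including the terminating NUL, without cutting a multibyte character.
 * Assumes src is valid, NUL-terminated UTF-8.
 * Returns the number of bytes copied, not counting the NUL.
 */
static size_t utf8_strlcpy(char *dst, const char *src, size_t dst_size)
{
        size_t len;

        assert(dst != NULL && src != NULL);
        if (dst_size == 0)
                return 0;

        len = strlen(src);
        if (len > dst_size - 1) {
                len = dst_size - 1;
                /* src[len] is the first byte that would be dropped. If it
                 * is a continuation byte, the cut falls inside a character,
                 * so back up to that character's lead byte. */
                while (len > 0 && is_utf8_continuation((unsigned char)src[len]))
                        len--;
        }
        memcpy(dst, src, len);
        dst[len] = '\0';
        return len;
}
```

With a 4-byte buffer and a string of two 2-byte Cyrillic letters, a plain
byte copy would keep three bytes and leave half a character at the end; the
sketch keeps only the first complete letter. GLib also has character-aware
helpers such as g_utf8_strncpy, which copies a given number of characters
rather than bytes.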
====================================================================== 

---------------------------------------------------------------------- 
 (0003625) paul (administrator) - 20-Jan-14 12:51
 http://www.dbmail.org/mantis/view.php?id=1038#c3625 
---------------------------------------------------------------------- 
Please provide steps to reproduce, or better yet: a patch that is validated
to fix this problem by a unit-test that exercises it. 

---------------------------------------------------------------------- 
 (0003626) ALyarskiy (reporter) - 22-Jan-14 08:19
 http://www.dbmail.org/mantis/view.php?id=1038#c3626 
---------------------------------------------------------------------- 
One possible way I see at the moment is to process headers through a forced
utf8 encoding, like this:
char *s, *t = dm_base_subject(value);

s = dbmail_iconv_str_to_utf8(t, charset);
... PROCESSING ...
... if db_encoding != utf8:
dbmail_iconv_str_to_db()

The other way is to pin the database encoding to utf8.

DB backend limitations:
Oracle supports 4-byte characters
Postgres supports 4-byte characters
SQLite supports 4-byte characters
MySQL's default utf8 is 3-byte, but since version 5.5.3 it can handle
4-byte characters by using the "utf8mb4" encoding

4-byte characters are supplementary characters, i.e. code points above
U+FFFF (http://www.i18nguy.com/unicode/supplementary-test.html).

So there is a possible problem with MySQL versions prior to 5.5.3; later
versions will require a database update to switch from 3-byte to 4-byte
(http://dev.mysql.com/doc/refman/5.5/en/charset-unicode-upgrading.html).

At the moment I have a patch that works with utf8 headers (postgres
backend). I will provide it right after some tests. 

---------------------------------------------------------------------- 
 (0003627) ALyarskiy (reporter) - 23-Jan-14 09:04
 http://www.dbmail.org/mantis/view.php?id=1038#c3627 
---------------------------------------------------------------------- 
OK, here is the patch. This is my first experience with C, so the patch
should be reviewed by a real developer =)

The patch includes:
1. A new function _header_exists to check whether a header already exists.
It is not really clean to try an insert and then check for errors.
2. UTF8 header support, assuming a maximum character size of 4 bytes (some
issues with mysql, see previous note).
3. A few added trace messages.

Example header:
RAW=[=?koi8-r?Q?[XXXXXX]_[33666]_=FA=C1=D0=D2=CF=D3_=CE=C1_=D4=C5=C8._=D0=CF?=
    =?koi8-r?B?xMTF0tbL1SAo+sHQ0s/TIMTP0C4gyc7Gz9LNwcPJySksIM7Fy8/S0sXL1M7P?=
    =?koi8-r?B?xSDazsHexc7JxSDEz9AuIMHU0snC1dTBINPPINrOwd7FzsnFzSDQzyDVzc/M?=
    =?koi8-r?B?3sHOycAgxMzRINTJ0MEgxM/Hz9fP0sEsIM7FINPX0drBzs7Px88g0yDc1MnN?=
    =?koi8-r?Q?_=C1=D4=D2=C9=C2=D5=D4=CF=CD?=] 

---------------------------------------------------------------------- 
 (0003628) paul (administrator) - 23-Jan-14 09:43
 http://www.dbmail.org/mantis/view.php?id=1038#c3628 
---------------------------------------------------------------------- 
OK, you're on to something here, but I'm rejecting the patch for the
following reasons:

- please use git-diff to generate the patch, or better yet: fork on
github, clone, hack, test, commit, push, and send me a pull request. Since
you're working off the 3.1 code, that will allow easy forward porting to the
master branch.

- the patch does way too much. You're fixing a non-existing problem with
the new _header_exists function. That case is already well covered in the
code. Also, *all* queries *must* happen inside a TRY/CATCH/FINALLY block.

- I don't see any unit-tests that demonstrate the problem and the fix:
please expand tests/check_dbmail_message.c 

Issue History 
Date Modified    Username       Field                    Change               
====================================================================== 
20-Jan-14 12:42  ALyarskiy      New Issue                                    
20-Jan-14 12:51  paul           Note Added: 0003625                          
22-Jan-14 08:19  ALyarskiy      Note Added: 0003626                          
23-Jan-14 09:04  ALyarskiy      Note Added: 0003627                          
23-Jan-14 09:04  ALyarskiy      File Added: utf8_header.patch                
23-Jan-14 09:43  paul           Note Added: 0003628                          
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://mailman.fastxs.nl/cgi-bin/mailman/listinfo/dbmail-dev
