A NOTE has been added to this issue. 
====================================================================== 
http://dbmail.org/mantis/view.php?id=655 
====================================================================== 
Reported By:                idk
Assigned To:                
====================================================================== 
Project:                    DBMail
Issue ID:                   655
Category:                   Database layer
Reproducibility:            random
Severity:                   minor
Priority:                   normal
Status:                     new
target:                      
====================================================================== 
Date Submitted:             15-Nov-07 01:32 CET
Last Modified:              15-Nov-07 10:28 CET
====================================================================== 
Summary:                    MIME headers are incorrectly parsed into cached
tables
Description: 
Some messages with MIME-encoded headers are inserted incorrectly into
dbmail_*field and dbmail_headervalue. It looks like the values are
encoded into utf8 twice.

E.g. the From field of one message (see Additional Information) has these
two instances:

SELECT physmessage_id, HEX(fromname) FROM dbmail_fromfield
WHERE physmessage_id BETWEEN 399826 AND 399827 

399826
4F6E6C696E652052657A65727661C48D6EC3AD2053797374C3A96D20534D4F534B 

399827
4F6E6C696E652052657A65727661C384C28D6EC383C2AD2053797374C383C2A96D20534D4F534B


Compare them. The first one is correct. Without accents it is "Online
Rezervacni System SMOSK". The second one is corrupted beginning at char 15,
which is UNICODE LATIN SMALL LETTER C WITH CARON \u010D, C48D in utf-8
encoding (here it is C384C28D). Every non US-ASCII character is stored
as 4 bytes instead of 2 bytes.
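
The difference can be checked by decoding both hex dumps as utf-8. This
is only a small Python sketch for illustration, not DBMail code; the
names GOOD_HEX and BAD_HEX are made up here, and the values are copied
from the query result above.

```python
# Hex dumps copied from the HEX(fromname) output quoted above.
GOOD_HEX = "4F6E6C696E652052657A65727661C48D6EC3AD2053797374C3A96D20534D4F534B"
BAD_HEX = (
    "4F6E6C696E652052657A65727661C384C28D6EC383C2AD"
    "2053797374C383C2A96D20534D4F534B"
)

good_bytes = bytes.fromhex(GOOD_HEX)
bad_bytes = bytes.fromhex(BAD_HEX)

# Both byte strings are valid utf-8, so both decode without an error.
good = good_bytes.decode("utf-8")
bad = bad_bytes.decode("utf-8")

print(good)        # the intended From name
print(repr(bad))   # repr() makes the stray control characters visible

# Byte length and character length of each variant.
print(len(good_bytes), len(good))
print(len(bad_bytes), len(bad))

# 1-based position of the first non-ASCII character in the good string.
first_non_ascii = next(i for i, ch in enumerate(good, start=1) if ord(ch) > 127)
print(first_non_ascii)
```

The corrupted value is still well-formed utf-8, which would explain why
the database accepts it without complaint.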

When you convert the first byte C4 from iso-8859-2 into utf-8, you get
C384, and when you convert 8D, you get C28D. So the "corrupted" utf-8
string could be produced by iconv -f iso-8859-2 -t utf8 (or maybe not
iso-8859-2 but windows-1250). However, encoding = utf8,
default_msg_encoding = utf8, the database is utf8 too, and the
environment is en_US.UTF-8; iso/win is not configured anywhere.
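
The double-encoding guess can be tested against the sample by misreading
the correct utf-8 bytes as a single-byte charset and encoding the result
as utf-8 again. The Python sketch below is an illustration only (the
helper double_encode is hypothetical, not anything in DBMail). On this
sample, iso-8859-1 appears to reproduce the stored bytes exactly, while
iso-8859-2 and windows-1250 agree for the byte C4 but map C3 to a
different character, so they give C482 where the dump has C383.

```python
# Hex dumps copied from the HEX(fromname) output quoted above.
GOOD_HEX = "4F6E6C696E652052657A65727661C48D6EC3AD2053797374C3A96D20534D4F534B"
BAD_HEX = (
    "4F6E6C696E652052657A65727661C384C28D6EC383C2AD"
    "2053797374C383C2A96D20534D4F534B"
)

good_bytes = bytes.fromhex(GOOD_HEX)
bad_bytes = bytes.fromhex(BAD_HEX)


def double_encode(raw: bytes, wrong_charset: str) -> bytes:
    """Misread utf-8 bytes as wrong_charset, then encode them as utf-8 again."""
    return raw.decode(wrong_charset).encode("utf-8")


# Candidate charsets for the wrong intermediate step (an assumption:
# the real cause inside DBMail is not known from this report).
matches = {}
for charset in ("iso-8859-1", "iso-8859-2", "cp1250"):
    try:
        matches[charset] = double_encode(good_bytes, charset) == bad_bytes
    except UnicodeDecodeError:
        # Some byte is undefined in this charset, so it cannot be the source.
        matches[charset] = False
    print(charset, matches[charset])
```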
====================================================================== 

---------------------------------------------------------------------- 
 idk - 15-Nov-07 10:28  
---------------------------------------------------------------------- 
Ah, don't disturb my characters! :o)

This is discrimination, my nation is a valid EU member! Shame! Czech
characters are valid UNICODE characters - and Mantis breaks them. :o)

So, Additional Information again (accented letters are followed by *):

The decoded strings, with Czech characters, are:

Subject: Zrus*eni* objedna*vky www.smosk.cz
From: "Online Rezervac*ni* Syste*m SMOSK" <[EMAIL PROTECTED]>

But about half of the headers are inserted as:

Subject: ZruA*i*enA*** 
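
To check whether a stored header value is affected, one can try to undo
one layer of encoding and see if the result is still valid utf-8. This
is a rough Python sketch under the assumption that the extra step
behaves like iso-8859-1 (which matches the From sample in this report);
undo_double_utf8 is a hypothetical helper, not a DBMail function.

```python
def undo_double_utf8(text: str) -> str:
    """Return text with one extra utf-8 layer removed, or text unchanged.

    A correctly stored value either contains characters outside
    iso-8859-1 or does not form valid utf-8 after re-encoding, so it
    is returned as it is.
    """
    try:
        return text.encode("iso-8859-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


# Values from the From field example in this report.
good = "Online Rezervační Systém SMOSK"
bad = good.encode("utf-8").decode("iso-8859-1")  # simulate the corruption

print(undo_double_utf8(bad) == good)   # corrupted value is repaired
print(undo_double_utf8(good) == good)  # correct value is left alone
```

Plain US-ASCII values pass through unchanged, so the check only has an
effect on headers that contain accented characters.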

Issue History 
Date Modified   Username       Field                    Change               
====================================================================== 
15-Nov-07 01:32 idk            New Issue                                    
15-Nov-07 10:28 idk            Note Added: 0002410                          
======================================================================

_______________________________________________
Dbmail-dev mailing list
Dbmail-dev@dbmail.org
http://twister.fastxs.net/mailman/listinfo/dbmail-dev
