ID:               40052
User updated by:  tim at whiteinteractive dot com
Reported By:      tim at whiteinteractive dot com
Status:           Open
Bug Type:         *Unicode Issues
Operating System: Mac OS X & Redhat
PHP Version:      4.4.5RC1
New Comment:

Interactive example: http://whiteinteractive.com/utf8bug/test2.php
and source: http://whiteinteractive.com/utf8bug/test2.phps


Previous Comments:
------------------------------------------------------------------------

[2007-01-07 22:11:27] tim at whiteinteractive dot com

Description:
------------
Problem with *De*serialization of multibyte characters in WDDX.
Correct sequences of <char> nodes (describing UTF8 characters) are
deserializing incorrectly.

I am experiencing this with 4.4.5 (snapshot) on Mac OS X and also 4.3.11
on Redhat. The newer version serializes into <char> nodes correctly
whereas the older version seems to leave these characters raw, but in
both versions the deserialization is incorrect.

Reproduce code:
---------------
http://whiteinteractive.com/utf8bug/test.phps

Expected result:
----------------
-- Create double byte character --
string(2) "£"
dec: 194 163
bin: 11000010 10100011
hex: C2 A3

-- Serialize with wddx_serialize_value --
<wddxPacket version='1.0'><header/><data><string><char code='C2'/><char code='A3'/></string></data></wddxPacket>

-- Deserialize with wddx_deserialize --
string(2) "£"
dec: 194 163
bin: 11000010 10100011
hex: C2 A3

Actual result:
--------------
-- Create double byte character --
string(2) "£"
dec: 194 163
bin: 11000010 10100011
hex: C2 A3

-- Serialize with wddx_serialize_value --
<wddxPacket version='1.0'><header/><data><string><char code='C2'/><char code='A3'/></string></data></wddxPacket>

-- Deserialize with wddx_deserialize --
string(2) "��"
dec: 128 163
bin: 10000000 10100011
hex: 80 A3

------------------------------------------------------------------------
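
For anyone who wants to check the byte arithmetic without a PHP build that has the wddx extension, here is a minimal sketch in Python of the round trip the report expects. It is not PHP's deserializer: it simply assumes each <char code='XX'/> node stands for one raw byte with hex value XX, which is what the expected result above implies. The helper name string_bytes is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Packet copied from the "Serialize with wddx_serialize_value" output in the report.
PACKET = (
    "<wddxPacket version='1.0'><header/><data><string>"
    "<char code='C2'/><char code='A3'/>"
    "</string></data></wddxPacket>"
)


def string_bytes(packet: str) -> bytes:
    """Rebuild the bytes of the first <string> node in a WDDX packet.

    Assumption (taken from the report's expected result, not from PHP's
    source): every <char code='XX'/> child contributes exactly one byte.
    """
    node = ET.fromstring(packet).find("./data/string")
    out = bytearray()
    if node.text:
        out += node.text.encode("utf-8")
    for child in node:
        if child.tag == "char":
            out.append(int(child.get("code"), 16))
        if child.tail:
            # Literal text that follows a <char> node inside the string.
            out += child.tail.encode("utf-8")
    return bytes(out)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


expected = string_bytes(PACKET)
print(" ".join(f"{b:02X}" for b in expected))   # hex dump of the rebuilt string
print(is_valid_utf8(expected))                  # the expected bytes C2 A3
print(is_valid_utf8(bytes([0x80, 0xA3])))       # the bytes actually returned
```

Under that one-byte-per-node assumption the packet yields C2 A3, the UTF-8 encoding of "£". The 80 A3 sequence from the actual result starts with a continuation byte, so it is not valid UTF-8 at all, which is why it renders as two replacement characters.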