ID: 45037
Comment by: JeanLuc dot TRIPLET at yahoo dot fr
Reported By: JeanLuc dot TRIPLET at yahoo dot fr
Status: Open
Bug Type: WDDX related
Operating System: Windows
PHP Version: 5.2.6
New Comment:
Modifying wddx.c as shown below solves the problem: php_wddx_serialize_var()
then UTF-8 encodes parameter names, as is already done for parameter values.
Original wddx.c:
void php_wddx_serialize_var(wddx_packet *packet, zval *var, char *name,
                            int name_len TSRMLS_DC)
{
    char *tmp_buf;
    char *name_esc;
    int name_esc_len;
    HashTable *ht;

    if (name) {
        name_esc = php_escape_html_entities(name, name_len,
            &name_esc_len, 0, ENT_QUOTES, NULL TSRMLS_CC);
        tmp_buf = emalloc(name_esc_len + sizeof(WDDX_VAR_S));
        snprintf(tmp_buf, name_esc_len + sizeof(WDDX_VAR_S),
            WDDX_VAR_S, name_esc);
        php_wddx_add_chunk(packet, tmp_buf);
        efree(tmp_buf);
        efree(name_esc);
    }
Modified wddx.c:
void php_wddx_serialize_var(wddx_packet *packet, zval *var, char *name,
                            int name_len TSRMLS_DC)
{
    char *tmp_buf;
    char *enc;
    char *name_esc;
    int name_esc_len;
    int enc_len;
    HashTable *ht;

    if (name) {
        name_esc = php_escape_html_entities(name, name_len,
            &name_esc_len, 0, ENT_QUOTES, NULL TSRMLS_CC);
        /* Convert the escaped name from ISO-8859-1 to UTF-8 */
        enc = xml_utf8_encode(name_esc, name_esc_len, &enc_len,
            "ISO-8859-1");
        tmp_buf = emalloc(enc_len + sizeof(WDDX_VAR_S));
        snprintf(tmp_buf, enc_len + sizeof(WDDX_VAR_S),
            WDDX_VAR_S, enc);
        php_wddx_add_chunk(packet, tmp_buf);
        efree(tmp_buf);
        efree(name_esc);
        efree(enc);
    }
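To illustrate what the extra encoding step does to a name such as "varnameé", here is a minimal standalone sketch of an ISO-8859-1 to UTF-8 conversion. The helper latin1_to_utf8() is hypothetical and is not part of wddx.c or of the PHP source; it only mirrors the kind of transformation the patch relies on xml_utf8_encode() to perform when called with "ISO-8859-1".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of wddx.c: converts an ISO-8859-1 buffer
 * to UTF-8. Returns a malloc'd, NUL-terminated string that the caller
 * must free, or NULL if the allocation fails. */
static char *latin1_to_utf8(const char *src, size_t len, size_t *out_len)
{
    char *dst;
    size_t i, j = 0;

    assert(src != NULL);

    /* Worst case: every input byte expands to two output bytes. */
    dst = malloc(2 * len + 1);
    if (dst == NULL) {
        return NULL;
    }

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)src[i];

        if (c < 0x80) {
            /* ASCII is copied unchanged. */
            dst[j++] = (char)c;
        } else {
            /* 0x80..0xFF becomes a two-byte sequence: 110xxxxx 10xxxxxx. */
            dst[j++] = (char)(0xC0 | (c >> 6));
            dst[j++] = (char)(0x80 | (c & 0x3F));
        }
    }
    dst[j] = '\0';

    if (out_len != NULL) {
        *out_len = j;
    }
    return dst;
}
```

With this sketch, the single byte 0xE9 ("é" in ISO-8859-1) becomes the two bytes 0xC3 0xA9, so an 8-byte name grows to 9 bytes. This is why the patched code sizes tmp_buf from enc_len rather than name_esc_len.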
Could you please include a modification like this one in a future
version?
Thanks in advance.
Previous Comments:
------------------------------------------------------------------------
[2008-05-19 10:13:11] JeanLuc dot TRIPLET at yahoo dot fr
Description:
------------
wddx_add_vars() correctly converts values to UTF-8 but does not convert
var names to UTF-8, so wddx_deserialize() returns an empty array when the
XML packet contains var names with accented characters.
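A plausible explanation, assuming the XML parser treats a packet without an encoding declaration as UTF-8, is that a raw ISO-8859-1 byte such as 0xE9 in a var name makes the packet an invalid byte sequence. The sketch below uses a hypothetical helper, is_structurally_utf8(), which is not part of PHP, to show the difference between the two forms of the name. It only checks lead and continuation bytes and does not reject overlong forms or surrogates, as a real parser would.

```c
#include <stddef.h>

/* Hypothetical helper, not part of PHP: returns 1 if every lead byte is
 * followed by the expected number of continuation bytes, 0 otherwise. */
static int is_structurally_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t need, k;

        if (c < 0x80) {
            need = 0;                    /* ASCII */
        } else if ((c & 0xE0) == 0xC0) {
            need = 1;                    /* 110xxxxx */
        } else if ((c & 0xF0) == 0xE0) {
            need = 2;                    /* 1110xxxx */
        } else if ((c & 0xF8) == 0xF0) {
            need = 3;                    /* 11110xxx */
        } else {
            return 0;                    /* stray continuation or invalid lead */
        }

        /* The sequence must not run past the end of the buffer. */
        if (need > len - i - 1) {
            return 0;
        }
        for (k = 1; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80) {
                return 0;                /* not a 10xxxxxx continuation byte */
            }
        }
        i += need + 1;
    }
    return 1;
}
```

Under this check, the ISO-8859-1 form "varname\xE9" is rejected because 0xE9 looks like the start of a three-byte sequence with no continuation bytes, while the UTF-8 form "varname\xC3\xA9" is accepted.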
Below is a script showing that string values are converted to UTF-8 by
wddx_add_vars(), but var names are not. It also shows that
wddx_deserialize() works fine when the input packet contains var names
that were UTF-8 encoded manually, but fails when the accented var names
are left unencoded by wddx_add_vars().
Could you please modify wddx_add_vars() to UTF-8 encode var names, as is
already done for string values?
Thanks for your help.
Reproduce code:
---------------
<?php
// If varname is ascii, unserialize is OK //
$packet_id = wddx_packet_start("PHP");
$varname = "value é";
wddx_add_vars($packet_id,"varname");
$packet = wddx_packet_end($packet_id);
var_dump ($packet);
echo "\n\n";
$result = wddx_deserialize($packet);
var_dump ($result);
// If varname is non-ASCII, unserialize returns array(0) {} //
$packet_id = wddx_packet_start("PHP");
$varnameé = "value é";
wddx_add_vars($packet_id,"varnameé");
$packet = wddx_packet_end($packet_id);
var_dump ($packet);
$result = wddx_deserialize($packet);
var_dump ($result);
// If packet contains a UTF-8 encoded non-ASCII varname, unserialize is OK //
$packet = "<wddxPacket
version='1.0'><header><comment>PHP</comment></header><data><struct><var
name='varnameé'><string>value
é</string></var></struct></data></wddxPacket>";
var_dump ($packet);
$result = wddx_deserialize($packet);
var_dump ($result);
?>
Expected result:
----------------
string(159) "value é"
array(1) { ["varname"]=> string(7) "value é" }
string(160) "value é"
array(1) { ["varnameé"]=> string(7) "value é" }
string(161) "value é"
array(1) { ["varnameé"]=> string(7) "value é" }
Actual result:
--------------
string(159) "value é"
array(1) { ["varname"]=> string(7) "value é" }
string(160) "value é"
array(0) { }
string(161) "value é"
array(1) { ["varnameé"]=> string(7) "value é" }
------------------------------------------------------------------------