ID: 34928
User updated by: clemens at gutweiler dot net
Reported By: clemens at gutweiler dot net
-Status: Feedback
+Status: Open
Bug Type: WDDX related
Operating System: Linux
PHP Version: 4.4.0
New Comment:
"Real" Unicode chars do not work either.
But utf8_encode() applied to the output of chr() should return valid
chars anyway.
Test for "real" Unicode chars:
<?php
header( 'Content-Type: text/html; charset=UTF-8' );
?>
<html>
<body>
<form method="post">
<textarea name="text"><?php echo isset( $_POST['text'] ) ? htmlspecialchars( $_POST['text'] ) : ''; ?></textarea>
<input type="submit" />
</form>
</body>
</html>
<pre>
<?php
$text = 'umlaute: '.chr( 220 ).chr( 228 ).chr( 246 ).chr( 223 );
$text = utf8_encode( $text );
if( isset( $_POST['text'] ) ) $text = $_POST['text'];
var_dump( $text );
$wddx = wddx_serialize_value( $text );
var_dump( htmlentities( $wddx ) );
$data = wddx_deserialize( $wddx );
var_dump( $data );
$data = wddx_deserialize( '<?xml version="1.0" encoding="UTF-8" ?>'."\n".$wddx );
var_dump( $data );
$data = wddx_deserialize( '<?xml version="1.0" encoding="ISO-8859-1" ?>'."\n".$wddx );
var_dump( $data );
show_source( __FILE__ );
?>
</pre>
With PHP 4.4.0 and 5.0.5 I do not get valid UTF-8 chars back.
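To make the mismatch independent of how the browser decodes the page, the comparison could be done on raw bytes. This is only a minimal sketch, assuming the wddx extension is loaded; the helper name describe_bytes() is hypothetical and not part of the test above:

```php
<?php
// Hypothetical helper: report length and hex bytes, so the result does not
// depend on the charset the browser or terminal uses for display.
function describe_bytes( $s )
{
    return strlen( $s ) . ' bytes: ' . bin2hex( $s );
}

// UTF-8 bytes for "umlaute: " followed by U+00DC, U+00E4, U+00F6, U+00DF.
$original = "umlaute: \xC3\x9C\xC3\xA4\xC3\xB6\xC3\x9F";
echo describe_bytes( $original ), "\n";

// The wddx extension is not present in every build, so guard the calls.
if ( function_exists( 'wddx_serialize_value' ) ) {
    $packet = wddx_serialize_value( $original );
    $back   = wddx_deserialize( $packet );
    echo describe_bytes( $back ), "\n";
    echo ( $back === $original ) ? "round trip OK\n" : "round trip changed the bytes\n";
}
```

If the second line of output differs from the first, the hex dump shows which bytes were altered during the round trip.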
Previous Comments:
------------------------------------------------------------------------
[2005-10-21 19:23:42] [EMAIL PROTECTED]
Try to use real UTF8 chars instead of chr().
------------------------------------------------------------------------
[2005-10-20 12:01:24] clemens at gutweiler dot net
Why is this bug bogus?
The code uses utf8_encode() to encode the data as UTF-8.
The manual also says: "Note: If you want to serialize non-ASCII
characters you have to convert your data to UTF-8 first (see
utf8_encode() and iconv()).".
So shouldn't the code be correct?
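For reference, the claim that the input is valid UTF-8 can be checked at the byte level. A minimal sketch, assuming the iconv extension is available; iconv() is used here in place of utf8_encode(), and for ISO-8859-1 input both are expected to produce the same bytes:

```php
<?php
// Build the ISO-8859-1 bytes for the four umlaut characters, then
// convert them to UTF-8 as the manual note recommends.
$latin1 = chr( 220 ) . chr( 228 ) . chr( 246 ) . chr( 223 );
$utf8   = iconv( 'ISO-8859-1', 'UTF-8', $latin1 );

// Each of the four characters becomes a two-byte sequence in UTF-8.
echo bin2hex( $utf8 ), "\n";               // c39cc3a4c3b6c39f
echo strlen( 'umlaute: ' . $utf8 ), "\n";  // 17

// preg_match() with the /u modifier fails on invalid UTF-8 subjects,
// so a result of 1 means the string is well-formed UTF-8.
var_dump( preg_match( '//u', $utf8 ) === 1 );  // bool(true)
```

The length of 17 matches the string(17) in the expected result, which suggests the string passed to wddx_serialize_value() is already well-formed UTF-8.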
------------------------------------------------------------------------
[2005-10-20 11:20:53] [EMAIL PROTECTED]
Please do not submit the same bug more than once. An existing
bug report already describes this very problem. Even if you feel
that your issue is somewhat different, the resolution is likely
to be the same.
Thank you for your interest in PHP.
See bug #34913.
------------------------------------------------------------------------
[2005-10-20 11:01:33] clemens at gutweiler dot net
Description:
------------
Umlaut characters in UTF-8 are not encoded/decoded correctly by
wddx_serialize_value() and wddx_deserialize(), respectively.
In PHP 5, the code with the XML header and UTF-8 encoding returns
ISO-8859-1 chars instead of UTF-8 chars. Is that a bug too?
Reproduce code:
---------------
<?php
header( 'Content-Type: text/html; charset=UTF-8' );
echo '<pre>';
$original = utf8_encode( 'umlaute: '.chr( 220 ).chr( 228 ).chr( 246 ).chr( 223 ) );
var_dump( $original );
$wddx = wddx_serialize_value( $original );
#var_dump( htmlentities( $wddx ) );
$data = wddx_deserialize( $wddx );
var_dump( $data );
$data = wddx_deserialize( '<?xml version="1.0" encoding="UTF-8" ?>'."\n".$wddx );
var_dump( $data );
?>
Expected result:
----------------
string(17) "umlaute: Üäöß"
string(17) "umlaute: Üäöß"
string(17) "umlaute: Üäöß"
Actual result:
--------------
string(17) "umlaute: Üäöß"
string(17) "umlaute: ÿäöÿ"
string(17) "umlaute: ÿäöÿ"
------------------------------------------------------------------------