On Tue, 3 Oct 2006 01:15:59 +0300, "Ahmad Al-Twaijiry" wrote:
> Hi everyone
>
> in my PHP code I use the following command to set a cookie with
> non-english word (UTF-8) :
>
> @setcookie ("UserName",$Check[1]);
>
> and in my html page I get this cookie using javascript :
[Snipped]
> but the result from writing the cookie using javascript is garbage, I
> don't get the right word !!
The problem is that JavaScript strings are UTF-16, while
PHP percent-encodes the cookie value byte by byte, so the
browser sees UTF-8 octets. You either have to store the
cookie as UTF-16 or do your own UTF-8 decoding in
JavaScript.

For example, consider the string "åäö", containing
the three funny characters in the Swedish language.
These characters are encoded as <c3 a5 c3 a4 c3 b6> in
UTF-8, and PHP stores these in the cookie as:

%C3%A5%C3%A4%C3%B6
Example:
------------------------------------------------------
<?php
setcookie ('UserName', "\xc3\xa5\xc3\xa4\xc3\xb6");
// setcookie ('UserName', "åäö");
header ('Content-Type: text/html; charset=utf-8');
?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<title>UTF-8 flavoured cookies</title>
<p>
<script type="text/javascript">
document.write(document.cookie);
</script>
------------------------------------------------------
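To see where that percent-encoded value comes from, here is a
minimal sketch in plain JavaScript. It assumes that, for these three
characters, encodeURIComponent() produces the same escapes as PHP
does for the cookie; the variable names are made up for illustration.

```javascript
// "åäö" written with escapes so the file's own encoding does not matter.
var name = "\u00e5\u00e4\u00f6";

// encodeURIComponent() percent-encodes the UTF-8 octets of each character.
var encoded = encodeURIComponent(name);

// List the UTF-16 code units the JavaScript string actually holds.
var units = [];
for (var i = 0; i < name.length; i++) {
    units.push(name.charCodeAt(i).toString(16));
}
var unitList = units.join(" ");
```

The string holds three code units (e5 e4 f6), but the encoded form
carries six octets, two per character.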
The unescape() function in JavaScript converts each
percent-escape to a separate Unicode code point, giving
<00c3 00a5 00c3 00a4 00c3 00b6> (displayed as "Ã¥Ã¤Ã¶"),
which, of course, is not what you want.
Example:
------------------------------------------------------
<?php
setcookie ('UserName', "\xc3\xa5\xc3\xa4\xc3\xb6");
// setcookie ('UserName', "åäö");
header ('Content-Type: text/html; charset=utf-8');
?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<title>UTF-8 flavoured cookies</title>
<p>
<script type="text/javascript">
var s = unescape(document.cookie);
var t = "";
for (var i = 0; i < s.length; i++) {
    var c = s.charCodeAt(i);
    t += c < 128 ? String.fromCharCode(c) : c.toString(16) + " ";
}
document.write(t);
</script>
------------------------------------------------------
While there are no doubt better ways to solve this,
you /could/ use the unescape() function to convert the
percent-encoded characters to Unicode code points, and
then write your own UTF-8 decoder to do the rest.
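
Before the full decoder, it may help to see the arithmetic for a
single two-octet sequence. This is only a sketch of the bit
manipulation, using the first character of the cookie above; the
variable names are mine.

```javascript
// After unescape(), each UTF-8 octet sits in its own code unit.
var lead = 0xc3;  // 110 00011: lead octet of a two-octet sequence
var cont = 0xa5;  // 10 100101: continuation octet

// Keep 5 payload bits from the lead octet and 6 from the continuation.
var codePoint = ((lead & 0x1f) << 6) | (cont & 0x3f);

// Turn the code point back into a one-character string.
var ch = String.fromCharCode(codePoint);
```

Here codePoint comes out as 0xe5, which is U+00E5 (å). Longer
sequences work the same way, with fewer payload bits in the lead
octet and six more from each continuation octet.
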
Example:
(This is an old C function hammered into JavaScript
shape. It is likely to be a horrible implementation
in JavaScript. The error checking adds a bit of bloat.
Note that the utf_8_decode function decodes the full
Unicode range, but String.fromCharCode() only handles
code points up to U+FFFF, so anything above that comes
out wrong.)
------------------------------------------------------
<?php
setcookie ('UserName', "\xc3\xa5\xc3\xa4\xc3\xb6");
header ('Content-Type: text/html; charset=utf-8');
?>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<title>UTF-8 flavoured cookies</title>
<p>
<script type="text/javascript">
function utf_8_decode (sin)
{
    // Number of octets in a sequence, given its first octet
    // (0 means the octet cannot start a sequence).
    function octet_count (c)
    {
        var octet_counts = [
            /* c0 */ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
            /* d0 */ 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
            /* e0 */ 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
            /* f0 */ 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 0, 0
        ];
        return c < 128 ? 1 :
               c < 192 ? 0 : octet_counts [(c&255)-192];
    }

    // Masks selecting the payload bits of the first octet, by length.
    var octet0_masks = [ 0x00,0x7f,0x1f,0x0f,0x07,0x03,0x01 ];
    var sout = "";
    var add;

    for (var si = 0; si < sin.length; si += add) {
        var c = sin.charCodeAt(si);
        add = octet_count(c);
        if (si+add <= sin.length) {
            var u = c & octet0_masks[add];
            var ci;
            // Fold in the continuation octets (10xxxxxx), six bits each.
            for (ci = 1;
                 (ci < add) && ((sin.charCodeAt(si+ci)&0xc0) == 0x80);
                 ci++)
                u = (u<<6) | (sin.charCodeAt(si+ci) & 0x3f);
            if (ci == add) {
                sout += String.fromCharCode (u);
            } else {
                // Invalid UTF-8 sequence. Should probably throw() instead.
                sout += "\ufffd"; // Replacement character.
                add = 1;
            }
        } else {
            // Invalid UTF-8 sequence. Should probably throw() instead.
            sout += "\ufffd"; // Replacement character.
            add = 1;
        }
    }
    return sout;
}

document.write (utf_8_decode(unescape(document.cookie)));
</script>
------------------------------------------------------
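
One of the "better ways" may be the built-in decodeURIComponent(),
which treats percent-escapes as UTF-8 octets. The sketch below is
not a drop-in replacement for the decoder above: getCookie is a
hypothetical helper of my own, and it takes the cookie string as a
parameter so it can be tried outside a browser.

```javascript
// Hypothetical helper: look up one cookie in a "k=v; k2=v2" string
// (the format of document.cookie) and decode its value as UTF-8.
function getCookie(cookieString, name) {
    var parts = cookieString.split(/;\s*/);
    for (var i = 0; i < parts.length; i++) {
        var eq = parts[i].indexOf("=");
        if (eq < 0) continue; // Skip fragments without a value.
        if (parts[i].substring(0, eq) === name) {
            // decodeURIComponent() reads %XX escapes as UTF-8 octets
            // and throws URIError on malformed input.
            return decodeURIComponent(parts[i].substring(eq + 1));
        }
    }
    return null; // No cookie with that name.
}

// In a browser this would be getCookie(document.cookie, "UserName").
var userName = getCookie("other=1; UserName=%C3%A5%C3%A4%C3%B6", "UserName");
```

If the value can contain spaces, check how your PHP version encodes
them; this sketch does not translate "+" back into a space.
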
> BTW:
> * I also tried the php function setrawcookie and I get the same problem
> * I use <META http-equiv=Content-Type content="text/html;
> charset=utf-8"> in my page
The <META> element can be useful for pages stored on
disk, but on the web you should send a real HTTP
Content-Type header, as the header() calls in the
examples above do.
--nfe
--
PHP General Mailing List (http://www.php.net/)