Hello all, I have a very strange problem.

Perl seems to be re-encoding data that is already UTF-8.

Versions: Perl 5.8.5, POE 0.9999, CentOS 4.4.

I added a bunch of debug output (breadcrumbs) to trace the problem.  This is
what I'm seeing.


My reverse HTTP proxy outputs the following:

http: content=[["set", "START-USER_", "value", "\xC2\xA9t\xC2\xA9"]]
http: chars=42
http: Content-Length=42

POE::Filter::HTTPD->put() converts it to:

Filter::HTTPD: content=[["set", "START-USER_", "value", "\xC2\xA9t\xC2\xA9"]]
Filter::HTTPD: chars=42
Filter::HTTPD: Content-Length=42
Filter::HTTPD: [-1]='HTTP/1.1 200 (OK) (OK)
Date: Sat, 22 Mar 2008 05:11:06 GMT
Server: POE HTTPD Component/0.10-PG (5.008005)
Content-Length: 42
Content-Type: application/json
Set-Cookie: BID=2439; path=/
X-POE-XUL: 0.04

[["set", "START-USER_", "value", "\xC2\xA9t\xC2\xA9"]]'
Filter::HTTPD: length=252

POE::Driver::SysRW->put() then sees the following:

Driver::SysRW: put='HTTP/1.1 200 (OK) (OK)
Date: Sat, 22 Mar 2008 05:11:06 GMT
Server: POE HTTPD Component/0.10-PG (5.008005)
Content-Length: 42
Content-Type: application/json
Set-Cookie: BID=2439; path=/
X-POE-XUL: 0.04

[["set", "START-USER_", "value", "\xC3\x83\xC2\xA9t\xC3\x83\xC2\xA9"]]'
Driver::SysRW: chars=256


WOAH!  "\xC2\xA9t\xC2\xA9" became "\xC3\x83\xC2\xA9t\xC3\x83\xC2\xA9"!  And
the total length grew from 252 to 256.  I HATE YOU MILKMAN ENCODING-COCK-UP!

(Note that when I write \xC2 in this email, I mean the single binary octet
0xC2 is in the data, not the four-character escape sequence.)
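
To make the symptom concrete, here is a minimal sketch of one way octets can
get encoded a second time.  It is a guess at the mechanism, not something
confirmed from the POE source: it assumes a byte string (UTF8 flag off) is
concatenated with a string that has the UTF8 flag on, so Perl upgrades the
bytes as if they were Latin-1 characters.  The header text and the variable
names are made up for illustration.  The sample body is "\xC3\xA9t\xC3\xA9"
(UTF-8 for "été"), chosen because upgrading it yields the same octets as the
value in the Driver::SysRW line; the input in the trace above may differ.

```perl
use strict;
use warnings;

# Assumed sample body: UTF-8 octets held as a byte string (UTF8 flag off).
my $body = "\xC3\xA9t\xC3\xA9";

# Hypothetical header that picked up the UTF8 flag somewhere upstream.
my $header = "X-Example: 1\r\n\r\n";
utf8::upgrade($header);

# Concatenation upgrades $body too, reading each octet as a Latin-1 character.
my $message = $header . $body;

# Octet counts of Perl's internal buffers, as opposed to character counts.
my ($message_octets, $header_octets);
{
    use bytes;
    $message_octets = length $message;
    $header_octets  = length $header;
}

# Encode the upgraded string to UTF-8 octets: this is the double-encoded form.
my $wire = $message;
utf8::encode($wire);

# Render the body part with \xNN escapes so the octets are visible.
(my $dump = substr($wire, $header_octets)) =~ s/([^\x20-\x7E])/sprintf '\\x%02X', ord $1/ge;

printf "flagged=%d chars=%d octets=%d\n",
    utf8::is_utf8($message) ? 1 : 0, length($message), $message_octets;
print "$dump\n";
```

The character count stays the same while the octet count grows by four, which
matches the kind of length mismatch seen between Content-Length and what the
driver reports.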

The workaround is for me to use JSON::XS->ascii, so the body contains only
ASCII.  But this still baffles me.  Does anyone understand how Perl handles
UTF-8 encoding here?  Or have any pointers?
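
For reference, the workaround looks roughly like the sketch below.  It uses
JSON::PP as a stand-in (an assumption made so the example runs with core
modules only); JSON::XS offers the same new(), ascii() and encode() methods.
The value is a made-up sample given as a character string.  With ascii
enabled, every non-ASCII character is written as a \uXXXX escape, so a later
upgrade of the string cannot change its octets.

```perl
use strict;
use warnings;
use JSON::PP;   # stand-in for JSON::XS in this sketch

# Sample value as decoded characters, not UTF-8 octets.
my $value = "\x{E9}t\x{E9}";

# ascii() makes the encoder escape every character above 0x7F.
my $json = JSON::PP->new->ascii->encode([[ "set", "START-USER_", "value", $value ]]);

# Upgrade a copy and encode it again: ASCII-only text comes out unchanged.
my $copy = $json;
utf8::upgrade($copy);
utf8::encode($copy);

print "$json\n";
print $copy eq $json ? "unchanged after upgrade\n" : "changed after upgrade\n";
```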

-Philip
