Hey,

That does not make sense, since neither curl nor PHP does any
kind of conversion like that.
Are you sure that you're not looking at the output from an XML
processor that has mangled UTF-8 into ISO-8859-1?
(expat has source and target encodings that can be set separately.)
And are you using something like mbstring with transparent encoding
translation turned on?
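
One quick way to check is to dump the raw bytes of $result straight
after curl_exec(), before any XML parser or output layer sees them.
A minimal sketch, assuming $result holds only the e-with-diaeresis from
your first test (the hard-coded sample string and the $is_utf8 variable
are mine, purely for illustration):

```php
<?php
// Stand-in for the value returned by curl_exec() in the quoted script.
// Assumed sample: LATIN SMALL LETTER E WITH DIAERESIS encoded as UTF-8.
$result = "\xC3\xAB";

// Hex dump of the raw bytes: "c3ab" means the data is still UTF-8,
// a lone "eb" would mean something converted it to ISO-8859-1.
echo bin2hex($result), "\n";

// Validity check via the PCRE /u modifier: preg_match() returns 1
// only if the subject is well-formed UTF-8.
$is_utf8 = preg_match('//u', $result) === 1;
echo $is_utf8 ? "valid UTF-8\n" : "not valid UTF-8\n";
```

If the dump shows c3ab here but the character looks wrong later on, the
conversion is happening after curl, not in it. Note also that the euro
sign has no ISO-8859-1 code point at all, so a converter further down
the line would have to treat it differently from the e-with-diaeresis.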

--Wez.


On 08/27/02, "Merijn van den Kroonenberg" <[EMAIL PROTECTED]> wrote:
> Hello List,
> 
> I have a problem with the PHP CURL module and UTF-8 data.
> My PHP script uses CURL to POST to a Perl CGI script, which returns
> UTF-8 encoded XML. I have verified from the webserver logfiles that the
> Perl script really returns UTF-8, but the data that I receive in $result
> (see below) has been converted to ISO-8859-1.
> 
>     $ch = curl_init();
>     curl_setopt($ch, CURLOPT_URL, $post_url);
>     curl_setopt($ch, CURLOPT_HEADER, 0);
>     curl_setopt($ch, CURLOPT_VERBOSE, 0);
>     curl_setopt($ch, CURLOPT_POST, 1);
>     curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
>     curl_setopt($ch, CURLOPT_POSTFIELDS, $postfields);
>     $result = curl_exec($ch); // UTF-8 compatible?
>     curl_close($ch);
> 
> I did some further testing and found that this behaviour is not
> consistent. Actually, I am pretty puzzled about this.
> 
> I was testing with an XML document that
> contained only the following multi-byte UTF-8 character:
> \303\253    (octal UTF-8) (LATIN SMALL LETTER E WITH DIAERESIS)
> The output from CURL was automatically converted to Latin-1.
> 
> After that I tested with another XML document that
> contained the following multi-byte UTF-8 character:
> \342\202\254 (octal UTF-8) (EURO SIGN)
> I was surprised to see that the output was now correct UTF-8.
> 
> Now I modified the first document and inserted the EURO SIGN into it.
> When I process this document again, the CURL output is UTF-8. So it
> seems the output of CURL depends on what it detects in its input, and it
> will try to convert the data to Latin-1 if possible?
> 
> Does anyone know how I can disable this behaviour? For me, CURL should not
> do any encoding or decoding of my data.
> 
> I also looked around the cURL library site (http://curl.haxx.se/), run by
> the developer of CURL. In message
> http://curl.haxx.se/mail/curlphp-2001-02/0005.html the cURL developer
> indicates that the library does not care about character sets, and that it
> might have something to do with how it is integrated into PHP.
> 
> If this is true, then there's probably not much I can do about it. If that's
> the case, please let me know, so I can find an alternative.



-- 
PHP Development Mailing List <http://www.php.net/>
To unsubscribe, visit: http://www.php.net/unsub.php
