On 03/12/2012 03:05 AM, Yasuo Ohgaki wrote:
> Hi
> 
> I think the following PHP 5.4.0 NEWS entry is misleading.
> 
>   . Changed default value of "default_charset" php.ini option from ISO-8859-1 to
>     UTF-8. (Rasmus)

Yes, I have fixed that now.

> I thought default_charset had become UTF-8, so I was expecting
> the following HTTP header.
> 
> content-type  text/html; charset=UTF-8
> 
> However, I got an empty charset (the 'charset=UTF-8' part was missing).
> So I looked at the source and found this line in SAPI.h:
> 
> #define SAPI_DEFAULT_CHARSET        ""
> 
> Shouldn't the empty string be "UTF-8"?

No, we can't force an output charset on people since it would end up
breaking a lot of sites.

>  - php.ini's default_charset should be UTF-8.
>  - determine_charset() should not blindly default to UTF-8 when there
> is no hint.
> 
> The old htmlentities/htmlspecialchars actually determined the charset from
> default_charset/mbstring.internal_encoding/etc. I think the old behavior
> is better than the current one.
> 
> How about making determine_charset() behave like 5.3 and setting
> SAPI_DEFAULT_CHARSET to "UTF-8"?

PHP 5.3's determine_charset behaves exactly like 5.4's. In 5.3 we have:

    if (charset_hint == NULL)
        return cs_8859_1;

and in 5.4 we have:

    if (charset_hint == NULL)
        return cs_utf_8;

So there is no difference in how they guess when there is no hint. The
only difference is the fallback itself: in that case 5.4 chooses UTF-8
and 5.3 chooses ISO-8859-1.
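
To make the comparison concrete, here is a rough standalone sketch of the
no-hint fallback in both releases. It is not the actual PHP source: the
sketch_* names, the release selector and the two-entry name matching are
all assumptions made for illustration, and the real determine_charset()
handles many more cases (including an empty-string hint) that are not
modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for PHP's internal charset identifiers. */
enum sketch_charset { SKETCH_CS_8859_1, SKETCH_CS_UTF_8, SKETCH_CS_OTHER };

/* Hypothetical selector for which release's fallback to model. */
enum sketch_release { SKETCH_PHP_5_3, SKETCH_PHP_5_4 };

/*
 * Simplified model of the fallback discussed above. Only the NULL-hint
 * branch mirrors the quoted source lines; the name matching below is an
 * assumption and covers just two charset names.
 */
enum sketch_charset sketch_determine_charset(const char *charset_hint,
                                             enum sketch_release release)
{
    assert(release == SKETCH_PHP_5_3 || release == SKETCH_PHP_5_4);

    /* No hint at all: the only point where 5.3 and 5.4 differ. */
    if (charset_hint == NULL) {
        return release == SKETCH_PHP_5_4 ? SKETCH_CS_UTF_8
                                         : SKETCH_CS_8859_1;
    }

    /* An explicit hint is honoured the same way in both releases. */
    if (strcmp(charset_hint, "UTF-8") == 0) {
        return SKETCH_CS_UTF_8;
    }
    if (strcmp(charset_hint, "ISO-8859-1") == 0) {
        return SKETCH_CS_8859_1;
    }
    return SKETCH_CS_OTHER;
}
```

In this sketch a caller that always passes an explicit hint gets the same
result from either release; only callers that pass NULL see the change.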

-Rasmus

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
