On Mon, Mar 12, 2012 at 2:49 PM, Rasmus Lerdorf <ras...@lerdorf.com> wrote:
> I caused this situation myself by not explicitly differentiating between
> the default charset for the internal htmlspecialchars() and
> htmlentities() functions and the output charset ini directive
> default_charset.
>
> The idea behind the default_charset ini directive was to act as the
> charset that gets specified in the HTTP Content-type header if you do
> not explicitly send your own Content-type header with the header()
> function. This has been muddied a bit by the fact that
> htmlspecialchars/htmlentities can take it into account when choosing
> which encoding to use for the data passed to them. This
> isn't done by default since it actually makes little sense. It is only
> done if you pass an empty string as the encoding argument. If you don't
> pass anything at all the default is UTF-8 in 5.4. In 5.3 this was
> ISO-8859-1.
>
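
The three cases described above can be sketched as follows. The sample
string and variable names are assumptions made up for illustration, and
the encodings are passed explicitly in the first two calls so the result
does not depend on which PHP version's default applies. The last call
assumes the internal_encoding ini setting (newer PHP versions) is left
empty, so that default_charset is what gets consulted.

```php
<?php
// "café <b>" encoded as ISO-8859-1: the lone byte 0xE9 is not valid UTF-8.
$latin1 = "caf\xE9 <b>";

// Treated as UTF-8 (what 5.4 does when no encoding is passed):
// the invalid byte sequence makes the function return an empty string.
$asUtf8 = htmlspecialchars($latin1, ENT_QUOTES, 'UTF-8');

// Treated as ISO-8859-1 (what 5.3 did): the 0xE9 byte passes through
// unchanged and only the angle brackets are escaped.
$asLatin1 = htmlspecialchars($latin1, ENT_QUOTES, 'ISO-8859-1');

// An empty string as the encoding asks the function to fall back to
// the default_charset ini setting.
ini_set('default_charset', 'ISO-8859-1');
$viaIni = htmlspecialchars($latin1, ENT_QUOTES, '');

var_dump($asUtf8 === '');                     // bool(true)
var_dump($asLatin1 === "caf\xE9 &lt;b&gt;");  // bool(true)
var_dump($viaIni === $asLatin1);
```

The empty result in the first call is what legacy ISO-8859-1 applications
run into when they call htmlspecialchars($str) with no further arguments
on 5.4.
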
> And here is where the confusion comes in. We, myself included, have told
> people that they can get the 5.3 behaviour back by setting the
> default_charset ini directive to iso-8859-1. But, this is only true if
> they are forcing htmlspecialchars/htmlentities to check that setting
> with an empty string as the encoding arg. Most apps just do
> htmlspecialchars($str) and nothing else. Plus, it is really not a good
> idea to tie the internal encoding of data being passed to these
> functions to the output charset. You should be able to change the output
> charset without worrying about your runtime encoding at that level.
>
> What this effectively means is that we are asking people to go through
> their code and add an explicit charset to all htmlspecialchars() and
> htmlentities() calls. I think this will be a hurdle for 5.4 adoption.
>
> What we really need is what we added in PHP 6. A runtime encoding ini
> setting that is distinct from the output charset which we can use here.
> That would allow people to fix all their legacy code to a specific
> runtime encoding with a single ini setting instead of changing thousands
> of lines of code. I propose that we add such a directive to 5.4.1 to
> ease migration.
+1, especially for non-UTF-8 applications.

Thanks
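
Until such a directive exists, one possible stopgap is a thin wrapper
that pins the encoding in a single place. This is only a sketch: the
constant APP_RUNTIME_ENCODING and the function esc_html() are
hypothetical names, not part of PHP, and call sites still have to be
switched over to the wrapper once.

```php
<?php
// Hypothetical application-wide setting; this is NOT a PHP ini directive.
define('APP_RUNTIME_ENCODING', 'ISO-8859-1');

// Hypothetical helper: all HTML escaping goes through one function, so the
// runtime encoding can later be changed without touching any call site.
function esc_html($str)
{
    return htmlspecialchars($str, ENT_QUOTES, APP_RUNTIME_ENCODING);
}

// The ISO-8859-1 byte 0xE9 survives; only & < > are escaped here.
echo esc_html("caf\xE9 & <b>"), "\n";
```

This keeps the data encoding independent of default_charset, so the
output charset can change without affecting how input is interpreted.
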
>
> See https://bugs.php.net/61354 for the first signs of grumbling about
> this one. As more people migrate I have a feeling this will end up being
> the most difficult part of the migration.
>
> -Rasmus
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
>



-- 
Laruence  Xinchen Hui
http://www.laruence.com/
