On Mon, Mar 12, 2012 at 2:49 PM, Rasmus Lerdorf <ras...@lerdorf.com> wrote:
> I caused this situation myself by not explicitly differentiating between
> the default charset for the internal htmlspecialchars() and
> htmlentities() functions and the output charset ini directive
> default_charset.
>
> The idea behind the default_charset ini directive was to act as the
> charset that gets specified in the HTTP Content-type header if you do
> not explicitly send your own Content-type header with the header()
> function. This has been muddied a bit by the fact that
> htmlspecialchars/htmlentities can take it into account when it is trying
> to choose which encoding to use when handling data passed to it. This
> isn't done by default since it actually makes little sense. It is only
> done if you pass an empty string as the encoding argument. If you don't
> pass anything at all the default is UTF-8 in 5.4. In 5.3 this was
> ISO-8859-1.
>
> And here is where the confusion comes in. We, myself included, have told
> people that they can get the 5.3 behaviour back by setting the
> default_charset ini directive to iso-8859-1. But, this is only true if
> they are forcing htmlspecialchars/htmlentities to check that setting
> with an empty string as the encoding arg. Most apps just do
> htmlspecialchars($str) and nothing else. Plus, it is really not a good
> idea to tie the internal encoding of data being passed to these
> functions to the output charset. You should be able to change the output
> charset without worrying about your runtime encoding at that level.
>
> What this effectively means is that we are asking people to go through
> their code and add an explicit charset to all htmlspecialchars() and
> htmlentities() calls. I think this will be a hurdle for 5.4 adoption.
>
> What we really need is what we added in PHP 6. A runtime encoding ini
> setting that is distinct from the output charset which we can use here.
> That would allow people to fix all their legacy code to a specific
> runtime encoding with a single ini setting instead of changing thousands
> of lines of code. I propose that we add such a directive to 5.4.1 to
> ease migration.

+1, especially for non-utf8 applications.
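
To make the problem concrete, here is a minimal sketch of what "add an
explicit charset to every call" looks like for an application whose data is
ISO-8859-1. The helper name h() and the constant APP_CHARSET are
hypothetical, made up for illustration; they are not part of PHP and not the
directive proposed above. The sketch only relies on the documented third
argument of htmlspecialchars().

```php
<?php
// Hypothetical application-wide setting: the encoding the app's strings
// are actually stored in. This name is an assumption for the example.
define('APP_CHARSET', 'ISO-8859-1');

// Hypothetical wrapper so the charset is named in one place instead of
// at every htmlspecialchars() call site.
function h($str)
{
    return htmlspecialchars($str, ENT_QUOTES, APP_CHARSET);
}

// "cafe <b>" with an e-acute, encoded as ISO-8859-1 (single byte 0xE9).
$latin1 = "caf\xE9 <b>";

// With the explicit legacy charset, only the markup characters are escaped
// and the 0xE9 byte passes through untouched.
$escaped = h($latin1);

// Treating the same bytes as UTF-8 fails: 0xE9 followed by a space is not
// a valid UTF-8 sequence, and without ENT_IGNORE or ENT_SUBSTITUTE the
// function returns an empty string.
$asUtf8 = htmlspecialchars($latin1, ENT_QUOTES, 'UTF-8');

var_dump($escaped === "caf\xE9 &lt;b&gt;");
var_dump($asUtf8 === '');
```

The second call shows why a bare htmlspecialchars($str) becomes a problem for
such an application once the default encoding is UTF-8, and why changing
every call site (or routing them through one wrapper like this) is the
migration cost being discussed.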
thanks

> See https://bugs.php.net/61354 for the first signs of grumbling about
> this one. As more people migrate I have a feeling this will end up being
> the most difficult part of the migration.
>
> -Rasmus
>
> --
> PHP Internals - PHP Runtime Development Mailing List
> To unsubscribe, visit: http://www.php.net/unsub.php
>

--
Laruence Xinchen Hui
http://www.laruence.com/