On 03/12/2012 12:51 PM, Stas Malyshev wrote:
> Hi!
>
>> But you can't necessarily hardcode the encoding if you are writing
>> portable code. That's a bit like hardcoding a timezone. In order to
>> write portable code you need to give people the ability to localize it.
>
> No, it's not like timezone at all. I have to support all timezones in a
> global app, but I don't have to internally support every encoding on
> Earth - having everything internally in UTF-8 works quite well, and a
> lot of applications do exactly that - they have everything internally in
> UTF-8 and only may convert when importing or exporting the data. I don't
> see anything in using UTF-8 throughout the app/library that makes it
> non-portable. However, if we allow to change defaults in
> htmlspecialchars() etc. that essentially makes having defaults useless
> as I'd have to explicitly specify UTF-8 each time - otherwise it's a
> gamble what encoding I'd actually get.

If everything were UTF-8 we wouldn't have any of these issues.
Unfortunately that isn't the case. The question is what to do with apps
that need to deal with non-UTF-8 data. Are we going to provide any help
to them beyond just telling them to convert everything to UTF-8?

We took steps in 5.4 to improve htmlspecialchars to understand more
encodings, and we have the concepts of script_encoding and
internal_encoding, which are used both in the engine and in mbstring.
Currently internal_encoding isn't checked by htmlspecialchars. If you
pass it '' it checks script_encoding and default_charset, which is a bit
odd since neither directly relates to the encoding of the internal data
you are feeding to it.

So maybe a way to tackle this is to use the mbstring internal encoding,
when it is set, as the htmlspecialchars default when it is called
without an encoding arg.

-Rasmus
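
For illustration only, the idea of defaulting to the mbstring internal encoding can be approximated in userland. This is a rough sketch, not a patch to the engine: escape_html() is a hypothetical helper name, the fallback to UTF-8 when mbstring is not loaded is an assumption made for the sketch, and it assumes mb_internal_encoding() returns a charset that htmlspecialchars() supports.

```php
<?php
// Hypothetical userland helper, not part of PHP itself.
// When no encoding is passed, it uses mbstring's internal encoding
// instead of whatever htmlspecialchars() would otherwise pick.
function escape_html($text, $encoding = null)
{
    if ($encoding === null) {
        // Assumption for this sketch: fall back to UTF-8 when the
        // mbstring extension is not available.
        $encoding = function_exists('mb_internal_encoding')
            ? mb_internal_encoding()
            : 'UTF-8';
    }

    // An explicit encoding always wins over the default chosen above.
    return htmlspecialchars($text, ENT_QUOTES, $encoding);
}
```

An app holding ISO-8859-1 data could then call mb_internal_encoding('ISO-8859-1') once at startup and use escape_html($value) everywhere, while code that wants UTF-8 regardless of configuration can still pass 'UTF-8' explicitly as the second argument.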