Re: [PHP-DEV] Where are we ACTUALLY on Unicode?

Lester Caine Tue, 16 Mar 2010 12:04:48 -0700

Rasmus Lerdorf wrote:

On 03/16/2010 10:40 AM, dreamcat four wrote:

As for text files on disk, if they are unicode, they are most commonly
utf-8 too. So then, why use utf-16 as internal unicode representation
in Php? It doesn't really make a lot of sense for most regular people
who want to use Php for their web application. Unless they don't
really care how slow its gonna be converting everything, constantly...


Well, the obvious original reason is that ICU uses UTF-16 internally and
the logic was that we would be going in and out of ICU to do all the
various Unicode operations many more times than we would be interfacing
with external things like MySQL or files on disk.  You generally only
read or write a string once from an external source, but you may perform
multiple Unicode operations on that same string so avoiding a conversion
for each operation seems logical.


Which begs the question - is ICU actually the right base?

But I'd still like some feedback on my idea that until an operation needs to
be able to handle multi byte character string processing, why not simply stay
in UTF-8? No reason why a string variable can't be converted only when needed,
and then dropped back to UTF-8 if needed later? And if the user is only using
single byte characters then the multi byte stuff never kicks in anyway? If you
NEED raw speed use the basic character set.
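
To make the idea concrete, here is a rough sketch of what "stay in UTF-8 until something needs more" could look like. It is written in Python purely for illustration and is not how PHP or ICU handle strings. The class name LazyText, its methods, and the conversions counter are all hypothetical. It assumes the data arrives as valid UTF-8 bytes and that "single byte characters" means plain ASCII.

```python
class LazyText:
    """Hypothetical string holder: UTF-8 bytes at rest, decoded only on demand."""

    def __init__(self, raw: bytes):
        self._utf8 = raw        # storage form, as read from a file or database
        self._decoded = None    # cached code-point form, built only when needed
        self.conversions = 0    # counts how often a real decode happened

    def _is_single_byte(self) -> bool:
        # Pure ASCII means one byte per character, so byte operations are safe.
        return self._utf8.isascii()

    def _text(self) -> str:
        # Convert at most once, then reuse the cached result.
        if self._decoded is None:
            self._decoded = self._utf8.decode("utf-8")
            self.conversions += 1
        return self._decoded

    def length(self) -> int:
        # Fast path: for ASCII the byte count is the character count.
        if self._is_single_byte():
            return len(self._utf8)
        return len(self._text())

    def upper(self) -> "LazyText":
        # Fast path works on the bytes directly; the slow path converts,
        # operates, and hands back UTF-8 again.
        if self._is_single_byte():
            return LazyText(self._utf8.upper())
        return LazyText(self._text().upper().encode("utf-8"))

    def release(self) -> None:
        # "Drop back to UTF-8": discard the decoded copy, keep only the bytes.
        self._decoded = None

    def to_bytes(self) -> bytes:
        # Output to disk or a database needs no conversion at all.
        return self._utf8


plain = LazyText(b"hello")
accented = LazyText("h\u00e9llo".encode("utf-8"))
print(plain.length(), plain.conversions)
print(accented.length(), accented.conversions)
```

The trade-off against the "convert once at the boundary" argument shows up in the counter: ASCII-only strings never pay for a conversion, while strings with multi-byte characters pay once per cached decode rather than once per operation.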


--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk//
Firebird - http://www.firebirdsql.org/index.php

