On Wed, 2008-09-03 at 13:22 -0700, Brian Aker wrote:
> Hi!
> 
> On Sep 3, 2008, at 10:03 AM, Jim Starkey wrote:
> 
> > I'm planning to use ICU (IBM's International Components for Unicode)  
> > for the actual collations.  It's licensed under MIT's X11 license,  
> > and is GPL compatible.
> 
> Postgres made a go at using that (and so did one of the "P" languages
> according to Tim Bray). They all found it to be a less than desirable
> library from the standpoint of performance.

PHP 6 uses ICU, and therefore UTF-16 internally, which means a lot of
time is spent converting everything (script code [identifiers, ...],
user input, data from the various libraries PHP uses, ...), mostly from
UTF-8 to UTF-16, processing it, and then converting it back to UTF-8.
This makes the code more complex, and simple benchmarks I ran early in
development showed quite some impact ...
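
To make the round trip concrete, here is a minimal Python sketch of the
pattern: UTF-8 comes in, gets converted to UTF-16 for internal
processing, and is converted back to UTF-8 on the way out. This is not
PHP or ICU code; the function names are made up for illustration, and
the "processing" step is only a placeholder.

```python
# Illustrative sketch only: hypothetical helper names, not PHP or ICU APIs.

def utf8_to_utf16(data: bytes) -> bytes:
    """Convert UTF-8 bytes to UTF-16 (little-endian, no BOM)."""
    return data.decode("utf-8").encode("utf-16-le")


def utf16_to_utf8(data: bytes) -> bytes:
    """Convert UTF-16 (little-endian, no BOM) bytes back to UTF-8."""
    return data.decode("utf-16-le").encode("utf-8")


def process_via_utf16(utf8_input: bytes) -> bytes:
    """Round trip: convert in, process (placeholder), convert out."""
    internal = utf8_to_utf16(utf8_input)
    # A UTF-16 based library would do its real work on `internal` here.
    return utf16_to_utf8(internal)


if __name__ == "__main__":
    query = "SELECT name FROM users".encode("utf-8")
    internal = utf8_to_utf16(query)
    # For ASCII-only input the UTF-16 form is twice as many bytes.
    print(len(query), len(internal))
    # The data survives the round trip unchanged; only time was spent.
    assert process_via_utf16(query) == query
```

Every value that crosses the boundary pays for two conversions, even
when nothing Unicode-specific is done to it in between.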

And well, PHP 6 has been in development for a few years, and we recently
merged all features except the Unicode-related stuff back into 5.3 -- you
can imagine how nicely using ICU works ;-)

I think Yahoo! (who sponsored a lot of the initial work) evaluated some
other options but found ICU to be the best way, even though it's
expensive, and, well, I guess for them performance really matters.

johannes


_______________________________________________
Mailing list: https://launchpad.net/~drizzle-discuss
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~drizzle-discuss
More help   : https://help.launchpad.net/ListHelp
