Re: [PHP-DEV] Unicode support

Aleksey Tulinov Tue, 14 Oct 2014 08:10:02 -0700

On 14/10/14 14:00, Chris Wright wrote:

Chris,

>> Latter is referring to difficulties like "excess memory usage" and
>> "rewrite the language". I'm developing an open-source Unicode
>> implementation library (nunicode), and it doesn't consume any heap at
>> all; it also works on native binary strings, as PHP does. Hence I think
>> that maybe it could help with at least these two problems.


> On the face of it, this implies a rather large performance hit and a
> tendency to overflow the stack much more readily, do you have any
> details on these elements?

I can't really tell whether the hit is going to be large before
understanding, at least approximately, what the final result would be.

I can tell that the internal complexity of nunicode is O(1) everywhere.
I'm comparing its performance to ICU, and nunicode mostly outperforms it.
I've compiled some numbers here:
https://bitbucket.org/alekseyt/nunicode#markdown-header-performance-considerations

Regarding the stack, I'm not sure I get the point. As far as I'm
concerned, the library does not have recursive calls, it does not have an
internal representation and it does not allocate on the stack
aggressively. Everything works on immutable binary strings, so the stack
will be used mostly for function calls.
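
To make the "no heap, no recursion" point more concrete, here is a
minimal sketch of the kind of routine I mean. It is for illustration
only: utf8_length() is a hypothetical name, not a nunicode function, and
it assumes well-formed, NUL-terminated UTF-8 input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of nunicode.
 * Counts code points in a NUL-terminated UTF-8 string.
 * Assumes the input is well-formed UTF-8; it is never modified. */
size_t utf8_length(const char *s)
{
    const unsigned char *p;
    size_t count = 0;

    assert(s != NULL);

    for (p = (const unsigned char *)s; *p != 0; ++p) {
        /* Continuation bytes have the form 10xxxxxx.
         * Every other byte starts a new code point. */
        if ((*p & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

The function reads the caller's bytes in place with one pointer and one
counter, so there is nothing to allocate and no call depth to speak of.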

But honestly, I feel like I'm not answering your question at all. Could
you possibly clarify it?

>> I would appreciate if someone would point me to a good read or explain
>> collective opinion on this topic. I'm basically interested in the
>> following questions:


> The only additional thing I can find quickly is something Pierre put
> together earlier this year, when PHP6 (now 7) discussions were
> started:
> https://wiki.php.net/ideas/php6/unicode


Thank you, this is exactly what I was looking for.

I would appreciate if someone would comment on the following:

> Some of the keys point we need to take care of are:
>
> 1) UTF-8 storage
> 2) UTF-8 support for almost (if not all) existing string APIs
> 3) Performance
>

> As of today, I did not find any library covering at least two of
> these key points.

I think I could claim that nunicode covers at least two key points, maybe
all of them, but I'm not sure about point 2). The API does include
operations on strings, but it simply follows the standard string
functions (UTF equivalents of strcoll(), strchr(), strstr(), etc.). Does
that sound good or not?
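
As a rough illustration of what a "UTF equivalent of strchr()" could look
like, here is a minimal sketch. The names utf8_next() and utf8_strchr()
are hypothetical and are not taken from the nunicode API, and the decoder
assumes well-formed, NUL-terminated UTF-8 (no validation is done).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decodes one code point from well-formed UTF-8
 * at p, stores it in *cp and returns a pointer to the next sequence. */
static const unsigned char *utf8_next(const unsigned char *p, uint32_t *cp)
{
    if (p[0] < 0x80) {                       /* 0xxxxxxx: 1 byte */
        *cp = p[0];
        return p + 1;
    }
    if ((p[0] & 0xE0) == 0xC0) {             /* 110xxxxx: 2 bytes */
        *cp = ((uint32_t)(p[0] & 0x1F) << 6) | (uint32_t)(p[1] & 0x3F);
        return p + 2;
    }
    if ((p[0] & 0xF0) == 0xE0) {             /* 1110xxxx: 3 bytes */
        *cp = ((uint32_t)(p[0] & 0x0F) << 12)
            | ((uint32_t)(p[1] & 0x3F) << 6)
            | (uint32_t)(p[2] & 0x3F);
        return p + 3;
    }
    /* Assumed 11110xxx: 4 bytes (input is taken to be well-formed). */
    *cp = ((uint32_t)(p[0] & 0x07) << 18)
        | ((uint32_t)(p[1] & 0x3F) << 12)
        | ((uint32_t)(p[2] & 0x3F) << 6)
        | (uint32_t)(p[3] & 0x3F);
    return p + 4;
}

/* Hypothetical strchr() analogue: returns a pointer to the first
 * occurrence of code point `target` in s, or NULL if it is not found.
 * The string is read in place; nothing is allocated or copied. */
const char *utf8_strchr(const char *s, uint32_t target)
{
    const unsigned char *p = (const unsigned char *)s;

    assert(s != NULL);

    while (*p != 0) {
        uint32_t cp;
        const unsigned char *next = utf8_next(p, &cp);
        if (cp == target) {
            return (const char *)p;
        }
        p = next;
    }
    return NULL;
}
```

The calling convention mirrors strchr(): a pointer into the caller's own
byte string comes back, with the character argument widened to a code
point.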


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
