Re: [PHP-DEV] Re: Alternative mbstring implementation using ICU

Stanislav Malyshev Fri, 31 Jul 2009 17:12:33 -0700

Hi!

They calculate the total width of a string based on "east asian width"
property, which is still valid to give a rough measurement of the
rendered string.

OK, I guess if it's some kind of special calculation that doesn't follow from others it should be preserved, there are tons of such special functions in PHP.
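
To make the "east asian width" calculation discussed above concrete, here is a minimal sketch in Python (not the mbstring or ICU implementation). It assumes that Wide and Fullwidth characters take two columns and everything else takes one; the real mb_strwidth() table may differ, and combining marks are not treated specially. The name display_width is hypothetical.

```python
import unicodedata

def display_width(text):
    """Rough column count based on the East Asian Width property.

    Assumption: 'W' (Wide) and 'F' (Fullwidth) count as 2 columns,
    every other code point counts as 1.
    """
    width = 0
    for ch in text:
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width

ascii_width = display_width("abc")
cjk_width = display_width("\u65e5\u672c\u8a9e")   # three Wide CJK ideographs
fullwidth_a = display_width("\uff41")             # FULLWIDTH LATIN SMALL LETTER A
halfwidth_kana = display_width("\uff71")          # HALFWIDTH KATAKANA LETTER A
```

Such a function gives only a rough measurement of the rendered string, which is all the quoted text claims for it.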

That's a common problem, IIRC PHP 6 converters have configurable error modes
for that. Don't unicode_set_error_handler() and unicode_set_error_mode() do
what you want?


I guess it isn't what I want. If my understanding is correct, a
handler set by unicode_set_error_handler() merely deals with the
aftermath and cannot interact with the converter.  There are good

That depends. For some error modes, it tells the converter to replace invalid chars with some other char or skip them. You can't, however, specify custom mappings right now (I'm not sure ICU allows that, but maybe it can be simulated...). Here the question is: is it really worth keeping a whole separate conversion system for just this, or can it be done with standard conversion, possibly somewhat tweaked?
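
As an analogy for the error modes described above (replace, skip, and a simulated custom mapping), here is a sketch using Python's codec error handlers. This is not the PHP 6 or ICU API; the handler name map_invalid_bytes and its `<XX>` output format are made up for illustration.

```python
import codecs

def map_invalid_bytes(err):
    # Hypothetical custom mapping: render each undecodable byte as <XX>
    # and tell the decoder to resume right after the bad span.
    if isinstance(err, UnicodeDecodeError):
        bad = err.object[err.start:err.end]
        return ("".join("<%02X>" % b for b in bad), err.end)
    raise err

codecs.register_error("map_invalid_bytes", map_invalid_bytes)

raw = b"ab\xffcd"  # 0xFF can never appear in valid UTF-8

replaced = raw.decode("utf-8", errors="replace")           # substitute U+FFFD
skipped = raw.decode("utf-8", errors="ignore")             # drop the bad byte
mapped = raw.decode("utf-8", errors="map_invalid_bytes")   # custom mapping
```

The point of the sketch is that the handler interacts with the converter (it chooses the substitute text and the resume position) instead of only being notified after the fact.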

In addition to these, shouldn't there be any case where one have to
manipulate Unicode strings on per-coded-character-basis rather than
per-grapheme-basis just like substr() in PHP6?

In PHP 6 right now it's actually the only case, grapheme functions are not even ported to PHP 6 yet (I know, not good) - but that's what regular str* functions should be doing, right?
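
To illustrate the difference between per-coded-character and per-grapheme handling, here is a small Python sketch. The grouping rule is a deliberate simplification (attach combining marks to the preceding character), not full Unicode grapheme cluster segmentation, and simple_clusters is a hypothetical helper name.

```python
import unicodedata

def simple_clusters(text):
    """Group each combining mark with the character before it.

    Assumption: this approximates grapheme clusters only for simple
    base + combining-mark sequences; it is not a UAX #29 implementation.
    """
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch) != 0:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

text = "e\u0301a"                       # 'e' + COMBINING ACUTE ACCENT + 'a'

code_point_count = len(text)            # per-coded-character view
clusters = simple_clusters(text)        # per-grapheme (approximate) view
first_code_point = text[:1]             # splits the accent from its base
first_cluster = clusters[0]             # keeps base and accent together
```

A code-point substr() cuts between the base letter and its accent, while the cluster view keeps them together; both operations have legitimate uses.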

--
Stanislav Malyshev, Zend Software Architect
s...@zend.com   http://www.zend.com/
(408)253-8829   MSN: s...@zend.com

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
