Hi!

They calculate the total width of a string based on the "East Asian Width"
property, which is still a valid way to get a rough measurement of the
rendered string.
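
For illustration, here is a minimal sketch of that kind of width calculation. It is written in Python rather than PHP only because Python's standard library exposes the East Asian Width property directly; the function name display_width and the choice to count Ambiguous ("A") characters as one column are assumptions of this sketch, not something taken from the functions under discussion.

```python
import unicodedata

def display_width(text: str) -> int:
    """Rough column count based on the East Asian Width property.

    Assumption of this sketch: Wide (W) and Fullwidth (F) code points
    take two columns, and everything else (including Ambiguous) takes one.
    """
    width = 0
    for ch in text:
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width

ascii_width = display_width("abc")
cjk_width = display_width("\u65e5\u672c\u8a9e")   # three CJK ideographs
mixed_width = display_width("\uff21B")            # FULLWIDTH LATIN CAPITAL LETTER A + "B"
```

The result is only an estimate of the rendered width: combining marks, Ambiguous characters and the font actually used can all change what ends up on screen.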

OK, I guess if it's some kind of special calculation that can't be derived from the other functions, it should be preserved; there are tons of such special functions in PHP.

That's a common problem; IIRC, PHP 6 converters have configurable error modes
for that. Don't unicode_set_error_handler() and unicode_set_error_mode() do
what you want?

I guess it isn't what I want. If my understanding is correct, a
handler set by unicode_set_error_handler() merely deals with the
aftermath and cannot interact with the converter.  There are good

That depends. For some error modes, it tells the converter to replace invalid chars with some other char or to skip them. However, you can't currently specify custom mappings (I'm not sure ICU allows that, but maybe it can be simulated...). The question here is: is it really worth keeping a whole separate conversion system just for this, or can it be done with the standard conversion, possibly somewhat tweaked?
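
To make the distinction between error modes and custom mappings concrete, here is a rough sketch using Python's codec machinery as an analogy. This is not the PHP 6 or ICU API; the handler name latin1_fallback and the sample bytes are made up for the example. The "replace" and "ignore" modes correspond to substituting or skipping invalid input, while a registered error callback shows one way a handler can feed a replacement back into the converter.

```python
import codecs

# The byte 0xE9 is "é" in Latin-1 but is not valid UTF-8 in this position.
raw = b"caf\xe9 ok"

# Error mode 1: substitute U+FFFD for the invalid byte.
replaced = raw.decode("utf-8", errors="replace")

# Error mode 2: silently skip the invalid byte.
skipped = raw.decode("utf-8", errors="ignore")

def latin1_fallback(err):
    """Hypothetical custom handler: reinterpret the bad bytes as Latin-1.

    The handler returns the replacement text and the position at which
    the converter should resume, so it interacts with the conversion
    instead of only reporting the failure afterwards.
    """
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    return bad.decode("latin-1"), err.end

codecs.register_error("latin1_fallback", latin1_fallback)
custom = raw.decode("utf-8", errors="latin1_fallback")
```

Whether the standard converters can be tweaked to allow something like the last case is exactly the open question here.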

In addition to these, aren't there cases where one has to
manipulate Unicode strings on a per-coded-character basis rather than
a per-grapheme basis, just like substr() in PHP 6?

In PHP 6 right now it's actually the only case; the grapheme functions haven't even been ported to PHP 6 yet (I know, not good). But that's what the regular str* functions should be doing, right?
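
As a small illustration of the difference between the two granularities, here is a sketch in Python, whose str type is indexed by code point much like the PHP 6 str* functions described above. The helper approx_graphemes is hypothetical and only attaches combining marks to the preceding character; it is a rough approximation, not the full grapheme cluster rules.

```python
import unicodedata

def approx_graphemes(text):
    """Very rough grapheme split: glue combining marks onto the
    preceding code point. Assumption: this ignores the many other
    cluster rules (Hangul jamo, ZWJ sequences, and so on)."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

s = "e\u0301a"  # "e" + COMBINING ACUTE ACCENT + "a"

code_point_count = len(s)        # counts code points
first_code_point = s[:1]         # slicing per code point drops the accent
clusters = approx_graphemes(s)   # keeps the accent with its base letter
```

Per-code-point slicing is what you want when working with the encoded data itself; per-grapheme operations matter once the result is shown to a user.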
--
Stanislav Malyshev, Zend Software Architect
s...@zend.com   http://www.zend.com/
(408)253-8829   MSN: s...@zend.com

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
