In PHP4/5 \xC4 and \x85 are not characters. They are bytes.

They are both. In PHP 5, a character and a byte are the same thing. In Unicode, they are not.

I can't pay such price. You are reducing available coding options and want

Then you can't use Unicode, at least not directly: you would have to convert all your Unicode data back to bytes and work with it at that level. Unicode works at the character level and you want to work at the byte level, so a translation has to happen somewhere along the way. We will try to make that easier, but I don't think it's reasonable to expect code based on this assumption to work without any changes whatsoever in PHP 6.
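
To make the byte/character translation concrete, here is a minimal sketch. It is written in Python 3 rather than PHP, because Python already draws the same line between byte strings and character strings. The UTF-8 encoding is an assumption; nothing in this thread names one.

```python
# Assumption: the data is UTF-8 encoded.
raw = b"\xC4\x85"             # two bytes, as in the \xC4 and \x85 example

text = raw.decode("utf-8")    # bytes -> characters
print(len(raw))               # length counted in bytes
print(len(text))              # length counted in characters

back = text.encode("utf-8")   # characters -> bytes, for byte-level work
print(back == raw)            # the round trip preserves the original bytes
```

Under that assumption the two bytes decode to a single character (U+0105), which is why byte-oriented code cannot simply be reused on character data without an explicit conversion step.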

If I take a look at ext/unicode/unicode.c, I see more PHP_FUNCTION
functions. I don't know PHP6 release schedule. If PHP6 is approaching RC

ext/unicode as it is now is very incomplete. It will be improved quite soon. I don't want to announce things prematurely, but please just have a little patience and you'll see the improvement.

stage, maybe docs can be updated to inform about these functions. PHP
provides API for PHP scripts developers. Strongest API part is good
documentation. I shouldn't have to dig through C sources in order to learn
about available interpreter features. If you write code now and document
it later, you won't document it or it will take some time and lots of bug
reports to sync sources with manual.

Nobody expects you to dig through C sources, and of course documentation is important. However, the basic assumption of Unicode, that characters and bytes are not the same, is something I wouldn't expect to change. Of course, having docs that describe common Unicode pitfalls and how to work around them is very important too. I think it will become a higher priority once we are closer to a release.
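
As one illustration of such a pitfall, here is a hedged sketch in Python 3: cutting a byte string at an arbitrary offset can split a multibyte character. The helper name truncate_utf8 is hypothetical, and the sketch assumes the input is valid UTF-8.

```python
def truncate_utf8(data: bytes, max_bytes: int) -> bytes:
    """Cut data to at most max_bytes without splitting a UTF-8 character.

    Hypothetical helper; assumes data is valid UTF-8 to begin with.
    """
    cut = data[:max_bytes]
    # errors="ignore" drops the incomplete sequence left at the end of the cut.
    return cut.decode("utf-8", errors="ignore").encode("utf-8")


data = "a\u0105b".encode("utf-8")   # b"a\xc4\x85b": 3 characters, 4 bytes

# Pitfall: a plain byte-level cut leaves half a character behind.
try:
    data[:2].decode("utf-8")
    naive_ok = True
except UnicodeDecodeError:
    naive_ok = False

# Workaround: drop the partial character instead of keeping a stray byte.
safe = truncate_utf8(data, 2)
print(naive_ok, safe)
```

The workaround returns fewer bytes than requested when the limit falls inside a character, which is usually the safer trade-off for fixed-size fields.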
--
Stanislav Malyshev, Zend Software Architect
[EMAIL PROTECTED]   http://www.zend.com/
(408)253-8829   MSN: [EMAIL PROTECTED]

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
