In PHP4/5 \xC4 and \x85 are not characters. They are bytes.
They are both. In PHP 5, a character and a byte are the same thing. In Unicode, they are not.
I can't pay such a price. You are reducing the available coding options and want
Then you can't use Unicode, at least not directly: you would have to convert all your Unicode data back to bytes and work with it on that level. Unicode works on the character level and you want to work on the byte level, so a translation has to happen somewhere along the way. We will try to make that easier, but I don't think it's reasonable to expect that code based on this assumption would work without any changes whatsoever in PHP 6.
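To make the translation step concrete, here is a minimal sketch. It is written in Python rather than PHP, purely as an assumption for illustration, because Python keeps bytes and text as separate types in the way described above; it says nothing about what the PHP 6 API will look like. The variable names are made up for the example. The two bytes \xC4 \x85 from this thread are the UTF-8 encoding of the single character U+0105.

```python
# The two bytes from the discussion: 0xC4 0x85 is the UTF-8 encoding of U+0105.
raw = b"\xC4\x85"

# Byte level: the data is two units long.
byte_count = len(raw)

# Translation step: bytes -> characters, with the encoding named explicitly.
text = raw.decode("utf-8")

# Character level: the same data is one unit long.
char_count = len(text)

# Translating back to bytes restores the original sequence.
round_trip = text.encode("utf-8")

print(byte_count, char_count, round_trip == raw)
```

Code that wants byte-level semantics would do its work on `raw` and only cross into `text` at a well-defined boundary, which is the kind of explicit conversion meant above.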
If I take a look at ext/unicode/unicode.c, I see more PHP_FUNCTION functions. I don't know the PHP6 release schedule. If PHP6 is approaching RC
ext/unicode as it is now is very incomplete. It will be improved quite soon. I don't want to announce things prematurely, but please just have a little patience and you'll see the improvement.
stage, maybe the docs can be updated to inform about these functions. PHP provides an API for PHP script developers. The strongest part of an API is good documentation. I shouldn't have to dig through C sources in order to learn about available interpreter features. If you write code now and document it later, either you won't document it at all or it will take some time and lots of bug reports to sync the sources with the manual.
Nobody expects you to dig through C sources, and of course documentation is important. However, the basic assumption of Unicode, that characters and bytes are not the same, is something I wouldn't expect to change. Of course, having docs that describe common Unicode pitfalls and how to work around them is very important too. I think it will become a higher priority once we are closer to a release.
-- Stanislav Malyshev, Zend Software Architect [EMAIL PROTECTED] http://www.zend.com/ (408)253-8829 MSN: [EMAIL PROTECTED]