On Mon, July 9, 2007 3:06 am, Stanislav Malyshev wrote:
>> But now \xF0 isn't going to be ASCII 128 anymore, is it?
>
> ASCII doesn't have any characters beyond 0x7f AFAIK, but it doesn't
> matter, I get what you mean. \xF0 in unicode mode would be U+00F0 of
> course. Now how preg_match should handle it depends on preg_match.

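To make the byte in question concrete: what \xF0 "is" depends entirely on which single-byte code page the reader assumes. The following is a minimal sketch in Python (not PHP, and not a description of how PHP 6 handles input); the three code pages and the helper names `decode_all` and `round_trips` are chosen only for illustration.

```python
# The single byte discussed in the thread.
raw = b"\xf0"

# Three single-byte code pages, picked here purely as examples.
CODE_PAGES = ["latin-1", "cp1252", "cp437"]


def decode_all(data):
    """Decode the same bytes under each example code page."""
    return {name: data.decode(name) for name in CODE_PAGES}


def round_trips(data, codec):
    """True if decoding and re-encoding with the same codec restores the bytes."""
    return data.decode(codec).encode(codec) == data


decoded = decode_all(raw)
for name, text in decoded.items():
    # Show the code point each code page assigns to byte 0xF0.
    print(f"{name:8} -> U+{ord(text):04X}")

# The original bytes are recoverable only when the codec is known.
print(round_trips(raw, "cp1252"))

# Guessing a different code page yields different UTF-8 output.
print(raw.decode("cp437").encode("utf-8") == raw.decode("cp1252").encode("utf-8"))
```

Under latin-1 and cp1252 the byte maps to U+00F0, while under cp437 it maps to U+2261, so a conversion to Unicode is reversible only if the source encoding is known and recorded.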
I should have said "Extended ASCII".

Unfortunately, there are at least 3 commonly used "Extended ASCII" encodings out there, and, yes, this is exactly the problem Unicode is trying to solve.

The only problem is that the data coming into most web apps is usually NOT UTF-16, nor even UTF-8, but "Windows Extended ASCII" (more or less), and most end users of PHP do not have the luxury of a dedicated server.

So they are going to be stuck with their data getting totally munged into UTF-16 on new PHP installations and, if I'm following this thread correctly, they are NOT going to be able to get back to the actual data that came IN to their web application.

So the ISPs aren't going to install PHP 6, because their users are going to be screaming at them that it broke their applications.

Or they'll all install it with this goofy non-Unicode mode, in which case there's not much point in having installed it, and y'all will effectively be maintaining 3 branches:

PHP 5
PHP 6 ASCII
PHP 6 Unicode

Unless you drop PHP 6 ASCII, in which case even fewer will bother to install PHP 6, not even in unicode.semantics off mode.

Seems to me we've painted ourselves into a corner where the number of people who actually install PHP 6 is going to be abysmally small...

But maybe I'm just being pessimistic.