Re: UTF-8 in PHP?

Meir Kriheli Thu, 28 Feb 2008 11:10:40 -0800

Dotan Cohen wrote:

On 28/02/2008, shimi <[EMAIL PROTECTED]> wrote:

It can be a UTF-8 problem in general - PHP has many functions that are not
UTF-8 aware, which is why we have the mbstring functions... which are
equivalent to historical PHP functions, but work well on multibyte
strings... there's even an option to overload the mbstring functions on top
of the old functions, see:
http://il.php.net/manual/en/ref.mbstring.php#mbstring.overload


However, I can't see an mbstring equivalent for preg_replace (while
ereg_replace does have one...) - which might suggest one of two options: a)
preg_replace is utf-8 ready or b) mbstring functionality doesn't support a
function for preg_replace... I know this might not be a very helpful comment,
but I tried my best...


Thanks, Shimi. It seems that preg_replace does not work on multibyte
(utf-8) strings because that would be too slow. I'm looking for an
alternative, and you may have just found it. 
Thanks.
http://blog.page2rss.com/2007/01/postgresql-vs-mysql-performance.html

Dotan Cohen

It's not that, since preg_replace has a modifier for UTF-8 (u). The problem seems to be detecting the word boundaries (\b), since the following works (a simpler pattern, neither perfect nor equivalent in functionality, e.g. it does not work at line endings):


$test=preg_replace('/([^\s]+)כ(\W)/Uu', '$1ך‎$2', $test);
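
If \b is the unreliable part, one possible way around it is to spell out the boundary with Unicode letter properties and lookarounds, which should also cover the end-of-line case. This is only a sketch under a few assumptions: the helper name fix_final_kaf is made up for illustration, PCRE is assumed to be built with Unicode property support (needed for \p{L}), and the text is assumed to be unpointed (no niqqud between letters).

```php
<?php
// Hypothetical helper, not from the thread: replace a word-final kaf (כ)
// with its final form (ך) without relying on \b.
function fix_final_kaf(string $text): string
{
    // (?<=\p{L})  the kaf must directly follow a letter (it is inside a word)
    // (?!\p{L})   and must not be followed by a letter (it ends the word);
    //             this also matches at the very end of the string
    // u           treat both pattern and subject as UTF-8
    $result = preg_replace('/(?<=\p{L})כ(?!\p{L})/u', 'ך', $text);

    // preg_replace returns null on error (e.g. invalid UTF-8 input);
    // fall back to the unchanged text in that case.
    return $result ?? $text;
}

// Only the kaf that ends the first word should change; the kaf in the
// middle of the second word is followed by a letter and is left alone.
echo fix_final_kaf('דרכ ארוכה'), "\n";
```

Because the lookarounds do not consume any characters, there is nothing to put back with $1 or $2, and a kaf followed by punctuation or by nothing at all is handled the same way as one followed by a space.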

Cheers
--
Meir Kriheli
