Re: [PHP] substr and UTF-8

2006-08-30 Thread Michael B Allen
On Wed, 30 Aug 2006 21:46:18 +0700 "Peter Lauri" <[EMAIL PROTECTED]> wrote:
> function is_utf8_start($b) {
>     return (($b & 0x80) == 0) || ($b & 0x40);
> }
> [/snip]
>
> :) I think I will go with the mb_substr function, it works for me :)

Yeah, I guess that's the right thing to do. Othe
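For anyone who cannot use mb_substr, the start-byte test quoted above can be used to cut a string on a character boundary. The following is only a minimal sketch: utf8_truncate_bytes is a hypothetical helper name, it takes a byte limit rather than a character count, and it assumes the input is already valid UTF-8.

```php
<?php
// True if the byte value begins a UTF-8 character: either ASCII (high bit
// clear) or a lead byte (11xxxxxx). Continuation bytes (10xxxxxx) fail both.
function is_utf8_start($b) {
    return (($b & 0x80) == 0) || (($b & 0x40) != 0);
}

// Hypothetical helper: keep at most $max_bytes bytes of $s without
// splitting a multi-byte character. Assumes $s is valid UTF-8.
function utf8_truncate_bytes($s, $max_bytes) {
    if ($max_bytes <= 0) {
        return '';
    }
    if (strlen($s) <= $max_bytes) {
        return $s;
    }
    $end = $max_bytes;
    // $s[$end] is the first byte that would be dropped. If it is a
    // continuation byte, the cut falls inside a character, so back up
    // until the cut lands on the start of a character.
    while ($end > 0 && !is_utf8_start(ord($s[$end]))) {
        $end--;
    }
    return substr($s, 0, $end);
}
```

With Thai text, where each character takes three bytes, a limit of 4 bytes keeps one whole character instead of leaving a broken trailing byte.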

RE: [PHP] substr and UTF-8

2006-08-30 Thread Peter Lauri
[snip]
Actually this is false. I don't know what I was thinking. The high bit will be set in all bytes of a multi-byte UTF-8 sequence. If it's not, it's an ASCII character. The bytes are actually laid out as follows [1]:

U-0000 - U-007F: 0xxxxxxx
U-0080 - U-07FF: 110x
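The layout above means the lead byte alone tells you how long a character is, which is enough to take the first N characters rather than the first N bytes. A rough sketch follows. Both function names are made up for illustration, and only the 1 to 4 byte forms permitted by current UTF-8 (RFC 3629) are recognised, which is an assumption beyond what the message itself shows.

```php
<?php
// Hypothetical helper: number of bytes in the character that starts with
// byte value $b, or 0 if $b is a continuation byte or not a valid lead byte.
function utf8_seq_len($b) {
    if (($b & 0x80) == 0x00) return 1; // 0xxxxxxx  ASCII
    if (($b & 0xE0) == 0xC0) return 2; // 110xxxxx
    if (($b & 0xF0) == 0xE0) return 3; // 1110xxxx
    if (($b & 0xF8) == 0xF0) return 4; // 11110xxx
    return 0;
}

// Hypothetical helper: return the first $n characters of the UTF-8 string $s.
function utf8_first_chars($s, $n) {
    $i = 0;
    $len = strlen($s);
    while ($n > 0 && $i < $len) {
        $step = utf8_seq_len(ord($s[$i]));
        // Stop before malformed input or a sequence cut off by the end of $s.
        if ($step == 0 || $i + $step > $len) {
            break;
        }
        $i += $step;
        $n--;
    }
    return substr($s, 0, $i);
}
```

This walks the string one character at a time, so it only inspects lead bytes and does not validate the continuation bytes that follow them.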

Re: [PHP] substr and UTF-8

2006-08-30 Thread Michael B Allen
On Wed, 30 Aug 2006 10:08:36 -0400 Michael B Allen <[EMAIL PROTECTED]> wrote:
> On Wed, 30 Aug 2006 18:34:20 +0700
> "Peter Lauri" <[EMAIL PROTECTED]> wrote:
>
> > Hi group,
> >
> > I want to limit the number of characters that are shown in a script. The
> > characters happen to be Thai, and the

Re: [PHP] substr and UTF-8

2006-08-30 Thread Michael B Allen
On Wed, 30 Aug 2006 18:34:20 +0700 "Peter Lauri" <[EMAIL PROTECTED]> wrote:
> Hi group,
>
> I want to limit the number of characters that are shown in a script. The
> characters happen to be Thai, and the page is encoded in UTF-8. Everything
> works, except when I want to cut the text (just take

Re: [PHP] substr and UTF-8

2006-08-30 Thread Jochem Maas
Peter Lauri wrote:
> Hi group,
>
> I want to limit the number of characters that are shown in a script. The
> characters happen to be Thai, and the page is encoded in UTF-8. Everything
> works, except when I want to cut the text (just take start of string).
>
> I do:
>
> echo substr($thaistring,