From: "Tom Emerson" <[EMAIL PROTECTED]>

> But if I have a text string, and that string is encoded in UTF-16, and
> I want to access Unicode character values, then I cannot index that
> string in constant time.
>
> To find character n I have to walk all of the 16-bit values in that
> string accounting for surrogates. If I use UTF-32 I don't need to do
> that. This very issue came up during the discussion of how to handle
> surrogates in Python.

Would this not be the same issue for composite characters, even *in* UTF-32? If you truly mean to work with characters here then it seems this is a problem you can always have.

MichKa

Michael Kaplan
Trigeminal Software, Inc.
http://www.trigeminal.com/
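
For readers following the thread, both points above can be illustrated with a rough Python sketch. The helper names `utf16_units` and `code_point_at` are hypothetical, made up for this illustration, and are not from any library mentioned in the thread. The first part walks 16-bit code units and accounts for surrogate pairs, so finding code point n takes linear time. The second part shows that a base letter plus a combining mark is still two code points even when every code point is individually addressable, as it would be in UTF-32.

```python
import unicodedata


def utf16_units(text):
    """Return the UTF-16 code units of text as a list of ints (hypothetical helper)."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def code_point_at(units, n):
    """Return the n-th code point (0-based) by scanning from the start.

    The scan is O(n) because a surrogate pair occupies two code units.
    """
    i = 0
    count = 0
    while i < len(units):
        u = units[i]
        is_pair = (
            0xD800 <= u <= 0xDBFF
            and i + 1 < len(units)
            and 0xDC00 <= units[i + 1] <= 0xDFFF
        )
        if is_pair:
            # Combine high and low surrogate into one supplementary code point.
            cp = 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            step = 2
        else:
            cp = u
            step = 1
        if count == n:
            return cp
        count += 1
        i += step
    raise IndexError("code point index out of range")


# Three code points, but four UTF-16 code units (U+10400 needs a surrogate pair).
units = utf16_units("a\U00010400b")
third = code_point_at(units, 2)

# Python's str is indexed by code point, which stands in for UTF-32 here.
# "e" followed by U+0301 (combining acute accent) is one visible character
# but two code points, so indexing by code point does not give "characters".
decomposed = "e\u0301"
composed = unicodedata.normalize("NFC", decomposed)
```

In this sketch `third` is the code point for "b", found only after stepping over the surrogate pair, while `decomposed` has length 2 and `composed` has length 1 even though both display as the same accented letter.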