Marco Ciampa wrote:
On Fri, Oct 05, 2007 at 01:14:23PM +0200, Luca Olivetti wrote:
[EMAIL PROTECTED] wrote:

* WideString allows indexed "[]" access to individual chars.
This does not seem to be correct. I read that a UTF-16 character can be 4 bytes long, so some calculation is needed sometimes...
Unless you're dealing with Klingon and ancient languages,
Like Chinese? Just a billion people use it...not a real problem at all...
:-\

I (wrongly) thought that Chinese was in the BMP :-(
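
For what it's worth, a quick check of how many UTF-16 code units a few characters need suggests the picture is mixed. This is only a minimal Python sketch (not Pascal); the sample characters and the helper names `utf16_units` and `in_bmp` are chosen here for illustration. The common ideograph U+4E2D fits in one code unit, while U+20000 from CJK Extension B lies outside the BMP and needs two:

```python
def utf16_units(ch: str) -> int:
    """Return how many 16-bit code units UTF-16 needs for one character."""
    # "utf-16-le" emits no byte order mark, so there are 2 bytes per code unit.
    return len(ch.encode("utf-16-le")) // 2


def in_bmp(ch: str) -> bool:
    """True if the code point lies in the Basic Multilingual Plane."""
    return ord(ch) <= 0xFFFF


# Sample characters picked for illustration only.
samples = {
    "A": "Latin capital A (U+0041)",
    "\u4e2d": "common CJK ideograph (U+4E2D)",
    "\U00020000": "CJK Extension B ideograph (U+20000)",
}

for ch, label in samples.items():
    print(f"{label}: in_bmp={in_bmp(ch)}, utf16_units={utf16_units(ch)}")
```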


I think you can assume that for 99.99% of currently spoken languages every
character will be exactly 2 bytes long.
Wrong, as I said before.

There's a risk of having some character with more than 2 bytes, but it is a small risk. With UTF-8 the risk is bigger, so you always have to traverse the string if you need access to a specific character index.
You have to go through the string for both UTF-8 and UTF-16 encodings, so the advantages are questionable at best...

Yes, but my (wrong) premise was that you could assume all characters are 2 bytes wide, so the Nth character (counting from 0) would be at byte offset N*2.
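
To make the difference concrete, here is a rough Python sketch (again, not Pascal) comparing the fixed-width premise with an actual walk over the UTF-16 code units. The function names `naive_nth` and `nth_code_point` and the test string are made up for this illustration, and the walk assumes well-formed input apart from treating an unpaired surrogate as a character of its own:

```python
import struct


def utf16_code_units(text: str) -> list:
    """Encode text as UTF-16 (little endian, no BOM) and return the 16-bit units."""
    data = text.encode("utf-16-le")
    return list(struct.unpack(f"<{len(data) // 2}H", data))


def naive_nth(units: list, n: int) -> int:
    """Fixed-width premise: character n is simply code unit n (byte offset n*2)."""
    return units[n]


def nth_code_point(units: list, n: int) -> int:
    """Walk the code units, counting a surrogate pair as a single character."""
    i = 0
    count = 0
    while i < len(units):
        unit = units[i]
        is_pair = (
            0xD800 <= unit <= 0xDBFF
            and i + 1 < len(units)
            and 0xDC00 <= units[i + 1] <= 0xDFFF
        )
        if is_pair:
            # Combine high and low surrogate into one supplementary code point.
            code_point = 0x10000 + ((unit - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            step = 2
        else:
            code_point = unit
            step = 1
        if count == n:
            return code_point
        count += 1
        i += step
    raise IndexError("character index out of range")


# "a", then U+20000 (outside the BMP), then "b".
units = utf16_code_units("a\U00020000b")
print([hex(u) for u in units])
print(hex(naive_nth(units, 1)))       # half of a surrogate pair, not a character
print(hex(nth_code_point(units, 1)))  # the full code point
print(hex(nth_code_point(units, 2)))  # "b", found only by walking the string
```

With only BMP characters in the string both functions agree, which is why the fixed-width assumption tends to go unnoticed until a supplementary character shows up.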

Bye
--
Luca Olivetti
Wetron Automatización S.A. http://www.wetron.es/
Tel. +34 93 5883004      Fax +34 93 5883007

