Monte Goulding wrote:

>> On 23 Jun 2017, at 7:18 am, Richard Gaskin wrote:
>>
>> is that true that UTF-16 gives us two-bytes per char across the
>> board?
>
> That’s true (the 16 means 16 bit) but internally strings may be either
> native 8 bit or unicode 16 bit.

How can we know which is in use for a given string?

Suppose I wanted to process a lot of text, so performance is critical. Working with bytes would be optimal, since any other chunk type, and even a Unicode character, may vary in length.

So if I wanted to create an index of byte offsets into a large chunk of text, how would I know how long a character is?
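
To make the question concrete, here is a rough sketch of the kind of index I have in mind. It is written in Python rather than LiveCode, and it assumes the text is held as UTF-16 (little-endian, no BOM), which may not match how the engine stores a given string internally. The function name utf16_byte_offsets is made up for illustration. In UTF-16 a code point above U+FFFF is stored as a surrogate pair, so it takes four bytes rather than two.

```python
def utf16_byte_offsets(text):
    """Return (offsets, data) for a UTF-16LE encoding of text.

    offsets[i] is the byte offset at which code point i starts in data.
    Assumption: UTF-16LE without a BOM; this is an illustration only and
    says nothing about LiveCode's internal string representation.
    """
    offsets = []
    pos = 0
    for ch in text:
        offsets.append(pos)
        # Code points above U+FFFF need a surrogate pair (4 bytes);
        # everything else fits in a single 16-bit unit (2 bytes).
        pos += 4 if ord(ch) > 0xFFFF else 2
    return offsets, text.encode("utf-16-le")


offsets, data = utf16_byte_offsets("a\u00e9\U0001F600z")
print(offsets)    # [0, 2, 4, 8]
print(len(data))  # 10
```

With an index like this, a byte range such as data[4:8] can be pulled out and decoded on its own without walking the string from the start. It only counts code points, though; a user-perceived character built from several code points (a base letter plus combining marks, say) would need extra handling.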

--
 Richard Gaskin
 Fourth World Systems
 Software Design and Development for the Desktop, Mobile, and the Web
 ____________________________________________________________________
 ambassa...@fourthworld.com                http://www.FourthWorld.com

_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
