Re: [Lazarus] dynamic string proposal

Martin Frb via Lazarus Wed, 16 Aug 2017 07:38:23 -0700

On 16/08/2017 13:37, Alexey via Lazarus wrote:

On 16.08.2017 15:30, Martin Frb via Lazarus wrote:
A char can be composed of several combining code points (each of them, AFAIK, in the 32-bit range). So a char can have 96 or more bits. (And not all of them have a combined form.)
See my previous post: I think each S[i] should be like a QWord (sizeof(one char) = sizeof(QWord)). It can be TextChar, and the type can be TextString. Internally it can be compressed to UTF-8. TextString is good if I want to parse text by "chars". If a "char" needs more bytes, let's take more (internally it is the same UTF-8).

Have a look at https://www.reddit.com/r/Unicode/comments/4yie0a/tallest_longest_unicode_character/


There is ONE character that comprises more than 200 codepoints.
The only way to store such a char is in a type of dynamic size (aka string).
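
As a rough illustration of the size problem, here is a small Python sketch (Python only because its standard library makes this easy to check; it is not a proposal for the Pascal side). The sample string is an assumption made up for this example, not the character from the Reddit link: a plain "a" followed by three combining marks.

```python
# Hypothetical sample: "a" + combining diaeresis + combining macron
# + combining dot below. Python's str is indexed by code point.
char = "a\u0308\u0304\u0323"

code_points = len(char)                          # number of code points
utf32_bits = len(char.encode("utf-32-le")) * 8   # 32 bits per code point
utf8_bytes = len(char.encode("utf-8"))           # variable-length encoding

print(code_points, utf32_bits, utf8_bytes)
```

Even this small example needs more than the 64 bits of a QWord when each code point takes 32 bits, and nothing limits how many combining marks may follow the base.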

Well, I couldn't find an official document that defines the boundaries of a char.

But as far as I can see: if ä is one character, and it can be encoded as "non-combining codepoint" + "combining codepoint", then a character is any sequence of one "non-combining codepoint" + zero or more "combining codepoints". (AFAIK Arabic script has chars that have several "combining codepoints", so this is happening in actual languages.)


The example as far as I checked fulfils this definition.
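
The definition above (one non-combining codepoint followed by zero or more combining codepoints) can be sketched in Python roughly as follows. This is only a sketch under assumptions: `split_chars` is a hypothetical helper name, and "combining" is taken to mean a non-zero canonical combining class as reported by `unicodedata.combining`, which is a simplification and does not cover every kind of mark.

```python
import unicodedata

def split_chars(text):
    """Group code points into 'chars': one non-combining code point
    plus all combining code points that directly follow it."""
    chars = []
    for cp in text:
        # Assumption: a non-zero canonical combining class means "combining".
        if chars and unicodedata.combining(cp) != 0:
            chars[-1] += cp      # attach the mark to the preceding char
        else:
            chars.append(cp)     # start a new char
    return chars

# "a" + diaeresis, plain "b", "c" + acute + dot below
latin = split_chars("a\u0308bc\u0301\u0323")

# Arabic beh + shadda + fatha: one base with two combining marks
arabic = split_chars("\u0628\u0651\u064e")

print(len(latin), len(arabic))
```

Each element of the result is itself a string of varying length, which matches the point that a char needs a type of dynamic size.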

--
_______________________________________________
Lazarus mailing list
Lazarus@lists.lazarus-ide.org
https://lists.lazarus-ide.org/listinfo/lazarus
