On 11/11/2019 10:49 AM, Jose Isaias Cabrera wrote:
> So, yes, it's bulky, but, if you want to count characters in languages such as Arabic, Hebrew, Chinese, Japanese, etc., the easiest way is to convert that string to UTF32, and do a string count of that UTF32 variable.
Between ligatures and combining diacritics, the number of Unicode code points in a string has little practical meaning. For example, it is not necessarily correlated with the width of the string as displayed on screen or on paper, nor with the number of graphemes a human would say the string contains, if asked.
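To make the point concrete, here is a minimal Python sketch. It assumes the "convert to UTF-32 and count" approach amounts to dividing the encoded byte length by four; the helper name `count_code_points_via_utf32` and the sample strings are illustrative only and not taken from the thread.

```python
import unicodedata


def count_code_points_via_utf32(text: str) -> int:
    """Count code points by encoding to UTF-32 (no BOM) and dividing by 4 bytes."""
    return len(text.encode("utf-32-le")) // 4


# Illustrative samples (assumptions, not from the original post).
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT, two code points
ligature = "\ufb01"      # LATIN SMALL LIGATURE FI, one code point shown as two letters

counts = {
    "precomposed": count_code_points_via_utf32(precomposed),
    "decomposed": count_code_points_via_utf32(decomposed),
    "ligature": count_code_points_via_utf32(ligature),
}

# The two spellings of the accented letter are canonically equivalent:
# NFC normalization maps the decomposed form onto the precomposed one.
same_after_nfc = unicodedata.normalize("NFC", decomposed) == precomposed

print(counts)
print(same_after_nfc)
```

The two spellings of the accented letter normally render identically and a reader would call each one character, yet their code point counts differ. The ligature goes the other way: one code point that a reader would likely count as two letters.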
> Most people have to figure out what Unicode they are using, count the bytes, divide by... and on, and on. Not me, I just take that UTF8, or UTF16 string, convert it to UTF32, and do a count.
And then what do you do with that count? What do you use it for?

-- Igor Tandetnik