On 11/11/2019 10:49 AM, Jose Isaias Cabrera wrote:
> So, yes, it's bulky, but, if you want to count characters in languages such as
> Arabic, Hebrew, Chinese, Japanese, etc., the easiest way is to convert that
> string to UTF32, and do a string count of that UTF32 variable.

Between ligatures and combining diacritics, the number of Unicode code points in
a string has little practical meaning. For example, it is not necessarily
correlated with the width of the string as displayed on the screen or on paper,
or with the number of graphemes a human would say the string contains, if asked.
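
To make that concrete, here is a rough Python sketch (not anything from SQLite; the helper names codepoint_count and approx_grapheme_count are made up for illustration). It counts code points by way of UTF-32, as suggested in the quoted text, and compares that with a crude estimate of what a reader would count. The estimate simply skips combining marks, which is an assumption on my part and only an approximation; real grapheme segmentation follows the Unicode text segmentation rules and handles many more cases.

```python
import unicodedata


def codepoint_count(s: str) -> int:
    """Count code points by encoding to UTF-32 (always 4 bytes per code point)."""
    # "utf-32-le" is used so that no byte order mark is prepended.
    return len(s.encode("utf-32-le")) // 4


def approx_grapheme_count(s: str) -> int:
    """Crude estimate only: count code points that are not combining marks."""
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)


composed = "\u00e9"      # "é" as a single precomposed code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT
ligature = "\ufb01"      # LATIN SMALL LIGATURE FI, which reads as two letters

# The same visible text yields different code point counts.
print(codepoint_count(composed), codepoint_count(decomposed))

# Skipping combining marks brings both forms to the same estimate.
print(approx_grapheme_count(composed), approx_grapheme_count(decomposed))

# The ligature is one code point, but compatibility normalization
# (NFKC) expands it to the two letters a reader would count.
print(codepoint_count(ligature), len(unicodedata.normalize("NFKC", ligature)))
```

The two spellings of "é" render identically, yet the UTF-32 count differs between them, and the ligature goes the other way: one code point for what a person would call two letters.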

> Most people have to figure out what Unicode they are using, count the bytes,
> divide by... and on, and on.  Not me, I just take that UTF8, or UTF16 string,
> convert it to UTF32, and do a count.

And then what do you do with that count? What do you use it for?
--
Igor Tandetnik

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
