Re: [sqlite] Things you shouldn't assume when you store names

Igor Tandetnik Mon, 11 Nov 2019 08:03:25 -0800

On 11/11/2019 10:49 AM, Jose Isaias Cabrera wrote:

> So, yes, it's bulky, but, if you want to count characters in languages such as
> Arabic, Hebrew, Chinese, Japanese, etc., the easiest way is to convert that
> string to UTF32, and do a string count of that UTF32 variable.


Between ligatures and combining diacritics, the number of Unicode codepoints in 
a string has little practical meaning. E.g. it is not necessarily correlated 
with the width of the string as displayed on the screen or on paper; or with 
the number of graphemes a human would say the string contains, if asked.
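
To make that concrete, here is a minimal Python sketch, assuming Python 3 and 
only the standard library. The helper name utf32_count is hypothetical; it 
simply mirrors the convert-to-UTF-32-and-count approach, and the sample strings 
are chosen for illustration.

```python
import unicodedata

def utf32_count(s: str) -> int:
    """Count code points by encoding to UTF-32 (4 bytes per code point).

    "utf-32-le" is used so that no byte-order mark is prepended.
    """
    return len(s.encode("utf-32-le")) // 4

composed = "\u00e9"      # "é" as one precomposed code point
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT
ligature = "\ufb01"      # LATIN SMALL LIGATURE FI, displayed as "fi"

print(utf32_count(composed))    # 1
print(utf32_count(decomposed))  # 2
print(utf32_count(ligature))    # 1

# Both spellings of "é" are canonically equivalent.
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
```

The two spellings of "é" compare equal after NFC normalization yet produce 
different counts, and the ligature counts as one code point although a reader 
would see two letters.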

> Most people have to figure out what Unicode they are using, count the bytes,
> divide by... and on, and on.  Not me, I just take that UTF8, or UTF16 string,
> convert it to UTF32, and do a count.


And then what do you do with that count? What do you use it for?
--
Igor Tandetnik

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
