On 4/12/2019 10:51 AM, x wrote:
I’m still confused by utf strings. For simplicity, suppose I set up an sqlite
function that takes a single string parameter and I want to scan the string to
count the number of occurrences of a certain character. If I knew the string
was made up entirely of ASCII chars I’d do this:
char *c = &sqlite3_value_text(0)[0];
int count=0;
while (*c) if (*c++ == SearchChar) count++;
How do I do the same thing if the string param is a utf-8 or utf-16 string and
the SearchChar is a Unicode character?
The problem you need to solve is "count occurrences of a substring in a
string". The substring in question could consist of one byte representing a single
ASCII character, or a sequence of bytes comprising a UTF-8 encoding of one Unicode
character. This really has nothing to do with SQLite.
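To make that concrete, here is a minimal sketch of the substring approach in plain C. The function name count_occurrences is hypothetical, not part of SQLite or any library, and it assumes both arguments are valid, NUL-terminated UTF-8. Because UTF-8 is self-synchronizing, the encoding of one character cannot match in the middle of another character's encoding, so a byte-wise search is enough.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts non-overlapping occurrences of the byte
   sequence `needle` in `haystack`. Both are assumed to be valid,
   NUL-terminated UTF-8 (plain ASCII is a subset of that). */
size_t count_occurrences(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);

    size_t needle_len = strlen(needle);
    if (needle_len == 0)
        return 0;               /* an empty needle is treated as "no match" */

    size_t count = 0;
    const char *p = haystack;
    while ((p = strstr(p, needle)) != NULL) {
        count++;
        p += needle_len;        /* resume the search after this match */
    }
    return count;
}
```

Inside an SQLite function you would presumably pass (const char *)sqlite3_value_text(argv[0]) as the haystack, and the UTF-8 bytes of the character you are looking for as the needle. Note that this compares encoded bytes only: a precomposed character and its decomposed (base plus combining mark) form are different byte sequences and will not match each other.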
I’m confused by the fact that Unicode characters are not a fixed number of
bytes so if I do this e.g.
wchar_t *c = (wchar_t*) sqlite3_value_text(0);
That's just wrong. sqlite3_value_text does *not* return a pointer to a sequence
of wchar_t; it returns a pointer to UTF-8 encoded bytes. Any attempt to actually
use the `c` pointer would exhibit undefined behavior.
does this mean a complete temporary copy of the value of sqlite3_value_text(0)
has to be constructed by the compiler such that all characters of the newly
constructed string are fixed width? If so, I’m just wanting to check if there’s
a way of avoiding this overhead.
You seem to ascribe some magical properties to a cast. Nothing is "constructed" by it -
it simply tells the compiler "take this pointer to a memory block, and believe that it
contains something different than what the type of the original pointer suggests; trust me, I know
what I'm doing".
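A small sketch may help show that a cast converts nothing. The function name below is made up for illustration, and the code assumes wchar_t is wider than one byte (true on the common platforms I know of). The bytes live in wchar_t storage purely so that reading through the wchar_t pointer is well defined in this demo; with the pointer returned by sqlite3_value_text you get no such guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical demo: returns 1 if casting a byte pointer to wchar_t*
   left the address and the underlying bytes untouched, and the first
   wchar_t read through it is NOT the character 'h'. */
int cast_is_only_a_reinterpretation(void)
{
    assert(sizeof(wchar_t) > 1);    /* assumption stated above */

    /* wchar_t storage keeps the buffer suitably aligned and typed. */
    wchar_t storage[8] = {0};
    const char text[] = "hello";
    memcpy(storage, text, sizeof text);

    const unsigned char *bytes = (const unsigned char *)storage;
    const wchar_t *wide = (const wchar_t *)bytes;   /* the cast in question */

    int same_address = ((const void *)wide == (const void *)bytes);
    int same_bytes = (memcmp(wide, text, sizeof text) == 0);
    /* wide[0] is several narrow bytes packed together, not L'h'. */
    int not_widened = (wide[0] != L'h');

    return same_address && same_bytes && not_widened;
}
```

The function returns 1: same address, same bytes, and the "wide character" you read back is just the first few narrow bytes lumped together.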
If you prefer UTF-16 encoding over UTF-8, there's sqlite3_value_text16 for that.
If you are unsure what UTF-8 and UTF-16 mean, see
https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/
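If you do go the UTF-16 route, keep in mind that UTF-16 is not fixed width either: characters outside the Basic Multilingual Plane take two 16-bit units (a surrogate pair). Here is a rough sketch of counting one Unicode code point in a UTF-16 buffer. The function name and parameters are hypothetical, and it assumes the units are in native byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: counts occurrences of the code point `target` in a
   buffer of `n_units` UTF-16 code units (native byte order assumed). */
size_t count_code_point_utf16(const uint16_t *units, size_t n_units,
                              uint32_t target)
{
    assert(units != NULL || n_units == 0);

    size_t count = 0;
    size_t i = 0;
    while (i < n_units) {
        uint32_t cp = units[i++];

        /* A high surrogate followed by a low surrogate encodes one code
           point at or above U+10000; combine the pair. */
        if (cp >= 0xD800 && cp <= 0xDBFF && i < n_units &&
            units[i] >= 0xDC00 && units[i] <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (units[i] - 0xDC00);
            i++;
        }

        if (cp == target)
            count++;
    }
    return count;
}
```

With SQLite, the buffer would presumably come from sqlite3_value_text16, and the number of units would be sqlite3_value_bytes16 divided by 2. An unpaired surrogate is simply compared as its own value here; stricter handling is left out of this sketch.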
--
Igor Tandetnik