On 4/12/2019 10:51 AM, x wrote:
I’m still confused by UTF strings. For simplicity, suppose I set up an SQLite 
function that takes a single string parameter and I want to scan the string to 
count the number of occurrences of a certain character. If I knew the string 
was made up entirely of ASCII chars I’d do this

char *c = &sqlite3_value_text(0)[0];
int count=0;
while (*c) if (*c++ == SearchChar) count++;

How do I do the same thing if the string param is a UTF-8 or UTF-16 string and 
the SearchChar is a Unicode character?

The problem you need to solve is "count occurrences of a substring in a 
string". The substring in question could consist of one byte representing a single 
ASCII character, or a sequence of bytes comprising a UTF-8 encoding of one Unicode 
character. This really has nothing to do with SQLite.
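
A minimal sketch of that approach in plain C, for illustration only. The name 
count_occurrences is made up here, and the code assumes both strings are valid, 
NUL-terminated UTF-8. Under that assumption the encoding of one character can never 
match in the middle of another character, so a plain byte search is enough.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count non-overlapping occurrences of the byte
 * sequence `needle` in `haystack`. Both are assumed to be valid,
 * NUL-terminated UTF-8; `needle` may be one ASCII byte or the multi-byte
 * encoding of a single Unicode character. */
static int count_occurrences(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);

    size_t len = strlen(needle);
    int count = 0;

    if (len == 0)
        return 0;  /* an empty needle is treated as "no matches" */

    /* strstr compares raw bytes, which is all that is needed here. */
    for (const char *p = strstr(haystack, needle); p != NULL;
         p = strstr(p + len, needle))
        count++;

    return count;
}
```

Inside an SQLite function the haystack would presumably be the result of 
sqlite3_value_text() on the first argument, cast from const unsigned char * to 
const char *.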

I’m confused by the fact that Unicode characters are not a fixed number of 
bytes so if I do this e.g.

wchar_t *c = (wchar_t*) sqlite3_value_text(0);

That's just wrong. sqlite3_value_text does *not* return a pointer to a sequence 
of wchar_t; it returns a pointer to UTF-8 encoded bytes (const unsigned char*). 
Any attempt to actually use the pointer `c` would exhibit undefined behavior.

does this mean a complete temporary copy of the value of sqlite3_value_text(0) 
has to be constructed by the compiler such that all characters of the newly 
constructed string are fixed width? If so, I’m just wanting to check if there’s 
a way of avoiding this overhead.

You seem to ascribe some magical properties to a cast. Nothing is "constructed" by it - 
it simply tells the compiler "take this pointer to a memory block, and believe that it 
contains something different than what the type of the original pointer suggests; trust me, I know 
what I'm doing".
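
Since a cast converts nothing, comparing against a Unicode code point means decoding 
the UTF-8 bytes yourself. That can be done in place, one character at a time, with no 
temporary copy. A rough sketch, with made-up names (next_codepoint, count_codepoint), 
assuming well-formed UTF-8 and doing no real validation:

```c
#include <stdint.h>

/* Hypothetical helper: decode the code point starting at *pp and advance
 * *pp past it. Assumes well-formed UTF-8. Malformed input is not rejected,
 * but the loop never reads past the terminating NUL. */
static uint32_t next_codepoint(const unsigned char **pp)
{
    const unsigned char *p = *pp;
    uint32_t cp = *p++;
    int extra = 0;  /* number of continuation bytes expected */

    if (cp >= 0xF0)      { cp &= 0x07; extra = 3; }  /* 4-byte sequence */
    else if (cp >= 0xE0) { cp &= 0x0F; extra = 2; }  /* 3-byte sequence */
    else if (cp >= 0xC0) { cp &= 0x1F; extra = 1; }  /* 2-byte sequence */

    /* Each continuation byte (10xxxxxx) contributes six more bits. */
    while (extra-- > 0 && (*p & 0xC0) == 0x80)
        cp = (cp << 6) | (uint32_t)(*p++ & 0x3F);

    *pp = p;
    return cp;
}

/* Count how many times the code point `target` occurs in the
 * NUL-terminated UTF-8 string `s`. */
static int count_codepoint(const unsigned char *s, uint32_t target)
{
    int count = 0;
    while (*s != 0)
        if (next_codepoint(&s) == target)
            count++;
    return count;
}
```

The pointer type matches what sqlite3_value_text returns, so its result could be passed 
straight in.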

If you prefer UTF-16 encoding over UTF-8, there's sqlite3_value_text16 for that.
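
Note that UTF-16 is not fixed-width either: characters outside the Basic Multilingual 
Plane take two 16-bit units (a surrogate pair). A sketch of the same count over UTF-16, 
assuming a NUL-terminated buffer in native byte order viewed as uint16_t; the function 
name is made up for illustration:

```c
#include <stdint.h>

/* Hypothetical helper: count occurrences of the code point `target` in a
 * NUL-terminated UTF-16 string in native byte order. A high surrogate
 * followed by a low surrogate is combined into one code point; an unpaired
 * surrogate is compared as-is. */
static int count_codepoint_utf16(const uint16_t *s, uint32_t target)
{
    int count = 0;

    while (*s != 0) {
        uint32_t cp = *s++;

        if (cp >= 0xD800 && cp <= 0xDBFF && *s >= 0xDC00 && *s <= 0xDFFF) {
            /* Surrogate pair: 10 bits from each unit, offset by 0x10000. */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*s++ - 0xDC00);
        }

        if (cp == target)
            count++;
    }
    return count;
}
```

sqlite3_value_text16 returns a const void *, so the caller would have to cast it to 
const uint16_t * before passing it to a function like this.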

If you are unsure what UTF-8 and UTF-16 mean, see

https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/

--
Igor Tandetnik


