This seems to work OK as a sqlite function.
// assume values[0] & [1] are supplied and not null
// count occurrences of the single character values[1] in values[0]
char *c = (char *)sqlite3_value_text(values[0]);
char *Sep = (char *)sqlite3_value_text(values[1]);
int Byte1, Count=0, NrBytes, NrSepBytes = strlen(Sep);
while (*c)
{
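Byte1 = ((unsigned char)*c) >> 4; // high nibble of the lead byte
if ((Byte1 & 8) == 0) NrBytes = 1; // 0xxx: ASCII
else if ((Byte1 & 4) == 0) NrBytes = 1; // 10xx: stray continuation byte
else if ((Byte1 & 2) == 0) NrBytes = 2; // 110x (covers 0xC0-0xDF)
else if ((Byte1 & 1) == 0) NrBytes = 3; // 1110
else NrBytes = 4; // 1111
// strncmp stops at the terminator, so a truncated sequence is never read past
if (NrBytes == NrSepBytes && strncmp(c, Sep, NrBytes) == 0) Count++; // at first byte of Sep
while (NrBytes-- > 0 && *c) c++; // advance without stepping over the terminator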
}
sqlite3_result_int(ctx, Count);
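
For testing outside SQLite, the same loop can be sketched as a standalone function. The name count_utf8_char is made up for this example and is not part of the SQLite API. It assumes both arguments are NUL-terminated, that the separator is a single UTF-8 character, and it treats a stray continuation byte as a one-byte unit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of SQLite: counts how many times the single
   UTF-8 character sep occurs in text. Both must be NUL-terminated. */
static int count_utf8_char(const char *text, const char *sep)
{
    int count = 0;
    size_t sep_len = strlen(sep);

    while (*text) {
        unsigned char lead = (unsigned char)*text;
        size_t len;

        if (lead < 0x80)      len = 1;  /* 0xxxxxxx: ASCII */
        else if (lead < 0xC0) len = 1;  /* 10xxxxxx: stray continuation byte */
        else if (lead < 0xE0) len = 2;  /* 110xxxxx */
        else if (lead < 0xF0) len = 3;  /* 1110xxxx */
        else                  len = 4;  /* 11110xxx */

        /* strncmp stops at the terminator, so truncated input is not read past */
        if (len == sep_len && strncmp(text, sep, len) == 0)
            count++;

        /* advance by the sequence length, but never step over the terminator */
        while (len-- > 0 && *text)
            text++;
    }
    return count;
}
```

Comparing the lead byte against 0x80, 0xC0, 0xE0 and 0xF0 gives the same classification as testing the bits of the high nibble one at a time, and makes it easier to see that every two-byte lead (0xC0 through 0xDF) is handled.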
________________________________
From: sqlite-users <[email protected]> on behalf of
Scott Robison <[email protected]>
Sent: Friday, April 12, 2019 8:40:19 PM
To: SQLite mailing list
Subject: Re: [sqlite] Help with sqlite3_value_text
On Fri, Apr 12, 2019, 1:06 PM Keith Medcalf <[email protected]> wrote:
>
> Actually you would have to convert the strings to UCS-4. UTF-16 is a
> variable-length encoding. An actual "unicode character" is (at this
> present moment in time, though perhaps not tomorrow) 4 bytes (64-bits).
>
That is some impressive compression! :)
Regardless, even if you use UCS-4, you still have the issue of combining
characters. Unicode is complex, as has been observed.
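
To make the combining-character point concrete, here is a small sketch. The helper name utf8_code_points and the two sample strings are chosen for illustration only. It shows the same visible "é" stored as one code point (U+00E9) and as two (U+0065 followed by the combining acute U+0301), so a comparison by bytes or by code points treats them as different.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts code points by skipping UTF-8
   continuation bytes (10xxxxxx). Assumes well-formed input. */
static size_t utf8_code_points(const char *s)
{
    size_t n = 0;
    for (; *s; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}

static const char precomposed[] = "\xC3\xA9";   /* U+00E9 */
static const char decomposed[]  = "e\xCC\x81";  /* U+0065 U+0301 */
```

Treating the two forms as equal would need Unicode normalization, which neither this sketch nor a plain byte comparison does.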
_______________________________________________
sqlite-users mailing list
[email protected]
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users