On Mon, Feb 8, 2016 at 5:43 PM, Dominique Devienne <ddevienne at gmail.com>
wrote:

> sqlite> select c?, length(c?), length(cast(c? as blob)), unicode(c?) from
> t?;
> c?|length(c?)|length(cast(c? as blob))|unicode(c?)
> ??|1|2|252
> ?|1|1|129
>
> sqlite> .schema
> CREATE TABLE t? (c?);

What's surprising is that the second row's value is text, which is supposed
to be UTF-8 encoded, yet as far as I know it is not valid UTF-8: 0x81 is 129,
which should be encoded on two bytes.
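
To make the byte-level claim concrete, here is a minimal Python sketch (not SQLite itself; the helper name is_valid_utf8 is made up for illustration). It checks what UTF-8 produces for code points 252 and 129, and whether a lone 0x81 byte decodes at all:

```python
# Code points 252 (U+00FC) and 129 (U+0081) both need two bytes in UTF-8.
u_umlaut = chr(252).encode("utf-8")
c1_control = chr(129).encode("utf-8")
print(u_umlaut.hex().upper())    # C3BC
print(c1_control.hex().upper())  # C281


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (illustrative helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# A lone 0x81 is a continuation byte with no lead byte, so it is rejected.
lone_byte_valid = is_valid_utf8(b"\x81")
print(lone_byte_valid)  # False
```

So a one-byte text value X'81' cannot be well-formed UTF-8, whatever character it was meant to be.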

sqlite> select c?, length(c?), length(cast(c? as blob)), unicode(c?),
typeof(c?), quote(cast(c? as blob)) from t?;
c?|length(c?)|length(cast(c? as blob))|unicode(c?)|typeof(c?)|quote(cast(c?
as blob))
??|1|2|252|text|X'C3BC'
?|1|1|129|text|X'81'
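
One way a value like that second row can come about is sketched below, using Python's bundled sqlite3 module. This is an assumption about the mechanism, not a reconstruction of how the original row was actually inserted: it forces the raw byte into a text value with CAST, and the plain names t and c stand in for the real table and column.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table t (c)")

# Hypothetical route: push the raw byte 0x81 into a text value via CAST.
con.execute("insert into t (c) values (cast(x'81' as text))")

# Select only derived values; fetching c itself as a Python str would fail
# to decode, because the stored bytes are not valid UTF-8.
stored = con.execute(
    "select typeof(c), quote(cast(c as blob)) from t"
).fetchone()
print(stored)
```

The value is reported as text while its bytes are still the single 0x81, which suggests the bytes of a text value are stored without being validated as UTF-8.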

OK, I retried with the latest sqlite3.exe, and the results are different:

C:\Users\DDevienne>sqlite3
SQLite version 3.10.2 2016-01-20 15:27:19
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table t? (c?);
sqlite> insert into t? (c?) values (char(252));
sqlite> insert into t? (c?) values ('?');
sqlite> .header on
sqlite> select c?, length(c?), length(cast(c? as blob)), unicode(c?),
typeof(c?), quote(cast(c? as blob)) from t?;
c?|length(c?)|length(cast(c? as blob))|unicode(c?)|typeof(c?)|quote(cast(c?
as blob))
?|1|2|252|text|X'C3BC'
?|1|2|129|text|X'C281'
sqlite>

but the two rows are still not quite the same, despite the inserted
characters being logically the same (I think).
x'C3BC' is the correct UTF-8 encoding of lower-case u-umlaut (U+00FC).
x'C281' OTOH is the UTF-8 encoding of char(129), so the console code page was
not taken into account (I think), and the byte was taken "as-is".

No? --DD
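
P.S. A rough sketch of the code-page theory, in Python with its bundled sqlite3 module. The assumptions are mine: that the console used an OEM code page such as cp437 or cp850, and that the typed character reached the shell as the single byte 0x81. The names t and c stand in for the real table and column.

```python
import sqlite3

raw = b"\x81"  # assumed: the byte the console sent for the typed character

# Decoded with an OEM code page, 0x81 is lower-case u-umlaut (U+00FC).
via_cp437 = raw.decode("cp437")
via_cp850 = raw.decode("cp850")

# Taken "as-is" (byte value used directly as the code point), it is U+0081.
as_is = raw.decode("latin-1")

con = sqlite3.connect(":memory:")
con.execute("create table t (c)")
con.executemany("insert into t (c) values (?)", [(via_cp437,), (as_is,)])

rows = con.execute(
    "select unicode(c), typeof(c), quote(cast(c as blob)) from t order by rowid"
).fetchall()
for row in rows:
    print(row)
```

With the code page applied, the row comes out as 252 / X'C3BC', the same as char(252); with the byte taken as-is, it comes out as 129 / X'C281', which is what the 3.10.2 shell shows above.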
