On Mon, Feb 8, 2016 at 5:43 PM, Dominique Devienne <ddevienne at gmail.com> wrote:
> sqlite> select c?, length(c?), length(cast(c? as blob)), unicode(c?) from t?;
> c?|length(c?)|length(cast(c? as blob))|unicode(c?)
> ??|1|2|252
> ?|1|1|129
>
> sqlite> .schema
> CREATE TABLE t? (c?);
>
> What's surprising is that the second row/value is text, and it's supposed
> to be UTF-8 encoded, yet as far as I know, it's not valid UTF-8 (0x81 is
> 129, which should be encoded on two bytes).

sqlite> select c?, length(c?), length(cast(c? as blob)), unicode(c?), typeof(c?), quote(cast(c? as blob)) from t?;
c?|length(c?)|length(cast(c? as blob))|unicode(c?)|typeof(c?)|quote(cast(c? as blob))
??|1|2|252|text|X'C3BC'
?|1|1|129|text|X'81'

OK, I retried with the latest sqlite3.exe, and the results are different:

C:\Users\DDevienne>sqlite3
SQLite version 3.10.2 2016-01-20 15:27:19
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> create table t? (c?);
sqlite> insert into t? (c?) values (char(252));
sqlite> insert into t? (c?) values ('?');
sqlite> .header on
sqlite> select c?, length(c?), length(cast(c? as blob)), unicode(c?), typeof(c?), quote(cast(c? as blob)) from t?;
c?|length(c?)|length(cast(c? as blob))|unicode(c?)|typeof(c?)|quote(cast(c? as blob))
?|1|2|252|text|X'C3BC'
?|1|2|129|text|X'C281'
sqlite>

The two rows are still not quite the same, despite the inserted characters
being logically the same (I think). x'C3BC' is the correct UTF-8 encoding of
lower-case u-umlaut (U+00FC). x'C281', OTOH, is the UTF-8 encoding of
char(129), so the code page was not taken into account (I think), and the
byte was taken "as-is". No?

--DD
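
PS: To make the "as-is" reading concrete, here is a minimal Python sketch of the two interpretations of the byte 0x81. It assumes the console code page was CP437 (the thread does not say which one was active; CP850 maps 0x81 the same way), and the helper names are made up for illustration. It only mirrors the byte arithmetic and is not a statement about what the shell does internally.

```python
import sqlite3

# Assumption: the console code page is CP437 (not stated in the thread).
ASSUMED_CODEPAGE = "cp437"


def utf8_hex_as_is(byte_value: int) -> str:
    """Treat the raw byte as a Unicode code point, then encode as UTF-8."""
    return chr(byte_value).encode("utf-8").hex().upper()


def utf8_hex_via_codepage(byte_value: int, codepage: str = ASSUMED_CODEPAGE) -> str:
    """Decode the raw byte through the code page first, then encode as UTF-8."""
    return bytes([byte_value]).decode(codepage).encode("utf-8").hex().upper()


def sqlite_utf8_hex(code_point: int) -> str:
    """Ask SQLite itself for the stored bytes of char(code_point)."""
    conn = sqlite3.connect(":memory:")  # default text encoding is UTF-8
    try:
        row = conn.execute(
            "select hex(cast(char(?) as blob))", (code_point,)
        ).fetchone()
        return row[0]
    finally:
        conn.close()


if __name__ == "__main__":
    print("0x81 taken as-is:     ", utf8_hex_as_is(0x81))
    print("0x81 via code page:   ", utf8_hex_via_codepage(0x81))
    print("char(252) from SQLite:", sqlite_utf8_hex(252))
    print("char(129) from SQLite:", sqlite_utf8_hex(129))
```

If the assumption holds, the "via code page" result should match what char(252) stores, while the "as-is" result should match the second row of the output above.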