> On 29 Jan 2020, at 14:59, Richard Hipp <drh at sqlite.org> wrote:
> 
> On 1/29/20, Markus Winand <markus.winand at winand.at> wrote:
>> Hi!
>> 
>> I think there might be a glitch in the way SQLite 3.31.x derives the
>> collation information from the expression of a generated column.
> 
> I think the current behavior is correct.
> 
> If you want a column to have a non-standard collating sequence, you
> should add a COLLATE constraint to that column definition.  The fact
> that there is a COLLATE operator on the expression that determines the
> value of that column seems irrelevant.
> 
> Consider this:
> 
> CREATE TABLE t1(a TEXT DEFAULT('xyzzy' COLLATE nocase));
> 
> Would you expect the COLLATE operator in the DEFAULT clause to change
> the collating sequence associated with column a?  Why should a
> GENERATED ALWAYS AS constraint work differently from a DEFAULT
> constraint?

The critical point here is that SQLite accepts a GENERATED clause without 
explicit type:

CREATE TABLE tmp (str VARCHAR(255), str_nc1 GENERATED ALWAYS AS (str COLLATE 
NOCASE), str_nc2 GENERATED ALWAYS AS (str) COLLATE NOCASE);

In that case, the type of the generated column should taken from the result 
type of the expression[0]. As collations are part of the character string 
types[1], also the collation is taken from the expression. The COLLATE clause 
applied to expressions (as opposed to following the name of a data type) just 
changes the collation in of the character string type of that expression[2].

If the GENERATED clause explicitly sets a data type, then the collation of the 
expression is irrelevant as it doesn’t affect the type of the generated column, 
similar to your example with DEFAULT (it is only a value assignment in that 
case).

At the end everything boils down to this question:
What is the type, including the character set and collation, of generated 
columns that don’t specify a type explicitly? I think the only sensible answer 
is that it is the type of the expression, including its character set and 
collation.

Unfortunately, the “what would PostgreSQL do” approach doesn’t provide guidance 
here as PostgreSQL requires an explicit type for generated columns (and so do 
MySQL and MariaDB). In SQL Server, however, it works like I would expect it.

I guess the SQLite approach to types might affect all of my reasoning, but in 
other cases like the following it seems that SQLite also “transports” the 
applicable collation together with the result (type?) of an expression:

sqlite> CREATE TABLE tmp (str VARCHAR(255));
sqlite> INSERT INTO tmp VALUES ('a'), ('A'), ('b'), ('B');
sqlite> SELECT * FROM (SELECT str COLLATE NOCASE FROM tmp) ORDER BY str;
a
A
b
B

-markus

References to SQL:2016, Part 2. I know that this is not a strong argument to 
you, but nevertheless ;)

[0] 11.4 SR 13 c: 
        • If <generation clause> is specified, then the declared type of GE. 
 Whereas GE is defined in SR 10 a:
        • Let GE be the <generation expression> contained in GC.

[1] 4.2.1
   A character string type descriptor contains:
   [….]
        • —  The catalog name, schema name, and collation name of the collation 
of the character string type.

[2] 6.31 SR 4a:
        • If <collate clause> is specified, then the declared type of the 
<character factor> is the declared type of the <character primary>, except that 
the declared type collation is the collation identified by <collate clause>, 
and its collation derivation is explicit.


> 
> -- 
> D. Richard Hipp
> drh at sqlite.org
> 

_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to