On Thu, 22 Aug 2013 13:36:00 +0100
Simon Slavin <slav...@bigfraud.org> wrote:

> > 1.  where COLLATE( x AS NOCASE )
> >             between COLLATE( 'a' AS NOCASE )
> >             and COLLATE( 'b' AS NOCASE )
> 
> Again, you are trying to apply a collation to a value.  And because
> BETWEEN takes three operands you are being given the opportunity to
> apply three collations for one condition.  Which is nonsense.  So the
> syntax shouldn't allow it or even suggest it would work.  You should
> be casting the 'BETWEEN', not the operands you're supplying to it.

There are two issues:

1.  Is collation a property of the value, its type, or the operation? 
2.  Should the implied collation as currently implemented be retained,
or should it be replaced?  

Before we can consider what syntax to use, we have to agree on what the
semantics are, on what is means to compare two things.  So let's
dispense with item #2 until we agree on #1.  

Collation is not a property of the operation.  There is no other kind
of '=' than equality.  Equal is absolute; there is no modifier for it.  

I understand your temptation.  I've used strcasecmp(3) a few times
myself.  But when you realize there's no such thing in SQLite or in the
relational model as an "operator modifier", you'll see the light.  

Collation is not a property of the value.  Strings are just strings.
They have an *encoding*, necessarily, but not a collation.  

Collation therefore must, by process of elimination, be a property of
the type.  And a good thing, too, because it's a property of the
column, and a column is a type.  

Forget indexes; they're a feature of the implementation.  Think
logically, *semantically*: when are two things equal?  Two values in a
column either are or are not equal.  What determines what "equal"
*means*?  The type!  

The question of whether 

        A = B 

is true or false rests *not* on "how they are compared" but on *what*
they are.  The same is true for 

        'abc' = NAME 

To make that obvious, let's take a little mathematical excursion.  

Ask yourself how this is evaluated:

        where 1 = '1'

SQL is logical (or so we like to think), and we're being asked to
compare a number to a string, an apple to an orange.  What to do?  

There are two ways to treat that: report a type error, or perform a
type conversion.  Currently, as it happens, SQLite does something
rather surprising: 

sqlite> select 1 = '1', 1 * '1', 1 - '-1', 1 + '1', 1 / '1', 0 = '1';
1 = '1'     1 * '1'     1 - '-1'    1 + '1'     1 / '1'     0 = '1'   
----------  ----------  ----------  ----------  ----------  ----------
0           1           2           2           1           0         

As explained in http://www.sqlite.org/datatype3.html, "mathematical"
operators have different rules from "comparison" operators.
Mathematical binary operators demand two operands of the same type, and
the conversion rule is that they "cast both operands to the NUMERIC
storage class".  No such rule exists for logical operators; afaict the
comparison of strings to numbers is specified.  On the evidence, no
string is equal to any number, and no implicit conversion is defined.  

Note well: the operand didn't reach out and bang the operator on the
head to modify its behavior.  The operator converted the operand to a
comparable type.  Or not, as the case may be.  

All to say something simple but not obvious: the rules of equality --
of any comparision -- are governed by the type being compared.  

Now keep that in mind as you think about strings and their collations.
when we evaluate:

        where 'abc' = NAME

we must know:

1.  The collation of NAME (not of the values in NAME)
2.  The collation of 'abc'
3.  The rules of conversion

Dr. Hipp told us, "If X does not have a specified collation, then the
collation of Y is used instead", which I would express as, "if only one
of two types has the default collation, it is converted to the other
type".  Collations aren't "used"; they're *honored*!  

By that rule, if NAME has the default collation, the two operands are
of the same type and can be compared.  Else, if NAME's type uses a
collation other than the default, 'abc' is converted to that type and
the comparison is performed.  

That is the easiest way -- and, by the grace of Codd, the right way --
to think about it.  Get your types in order, and then compare.  It
works for any comparison, and any type, and our friend equality remains
unbothered.  And unmodified.  

Regards, 

--jkl
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to